Automated Syllabus of Applied Empirical Results Papers

Built by Rex W. Douglass (@RexDouglass) · GitHub · LinkedIn

Papers curated by hand, summaries and taxonomy written by LLMs.

Submit a paper to add for review

Improving Accuracy and Robustness in Political Analysis

> Enhancing Election Forecasting through Advanced Analytic Techniques

>> Considerations for Mail Voting Data Interpretation
  • Differentiate between the causes of non-return rates in vote-at-home states versus no-excuse absentee ballot states when measuring lost votes due to mail balloting, as abstention is likely a dominant factor driving non-returns in vote-at-home states, rendering the non-return rate uninformative about lost votes in those states. (Agresti and Presnell 2002)

  • Carefully consider the mode of voting (mail vs in-person) when analyzing voter behavior, as it has become an increasingly salient factor in shaping partisan attitudes towards election integrity and policy preferences. (“Harvard Data Science Review,” n.d.)

>> Minimal Assumptions Approaches for Increased Credibility
  • Consider multiple analytical approaches, including those that make minimal assumptions, to ensure robustness and credibility of your findings. (Imai and King 2004)
>> Addressing Challenges in Election Predictions with Novel Approaches
  • Incorporate multiple sources of information, including historical fundamentals and contemporary polls, into a hierarchical Bayesian framework to make accurate and robust election forecasts while avoiding overconfidence caused by simple poll averages. (Heidemanns, Gelman, and Morris 2020)

  • Consider the role of incentives in shaping forecasts, particularly in light of challenges in evaluating forecast calibration and communication, as demonstrated by the difficulties encountered in creating, communicating, and evaluating election predictions. (Gelman et al. 2020)

  • Consider the potential impact of differential nonresponse rates among different groups of respondents, particularly in situations where there is a high degree of polarization and mistrust towards pollsters and the media. (Gelman and Azari 2017)

  • Consider the possibility of “specification uncertainty” when building statistical models for election forecasting, which arises from the fact that different plausible models may produce varying predictions, leading to potentially biased estimates if only one model is selected. (NA?)

>> Multilevel Regression Poststratification (MRP) for Survey Data
  • Consider using multilevel regression and poststratification (MRP) to improve the accuracy of estimates derived from non-representative samples, especially when dealing with large datasets containing rich demographic information. (W. Wang et al. 2015)

  • Consider using multilevel regression and poststratification (MRP) to estimate voter turnout and vote choice within deeply interacted subgroups, while incorporating deeper levels of covariate interaction, allowing for nonlinearity and nonmonotonicity, accounting for unequal inclusion probabilities, making postestimation adjustments, and utilizing informative multidimensional graphical displays for model checking. (Ghitza and Gelman 2013)

  • Carefully consider the role of interactions when analyzing survey data, especially when using regression models, as failing to do so may lead to biased estimates due to the potential dependence of predictors on survey inclusion probabilities. (Gelman 2007a)

  • Consider combining multilevel regression modeling with poststratification techniques to improve the accuracy of state-level estimates derived from national polls, especially when dealing with sparse data in specific demographic and geographical subgroups. (Park, Gelman, and Bafumi 2004)
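
To make the two-step logic of MRP concrete, here is a deliberately minimal Python sketch on invented data. The cell-level model is a pooled logit rather than the hierarchical (partial-pooling) model the papers above use; a full MRP analysis would fit that model in a Bayesian framework such as Stan or PyMC.

```python
# Minimal MRP-style sketch: pooled logit + poststratification.
# All data are invented for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Fake survey: binary opinion y, with state and age group as predictors.
survey = pd.DataFrame({
    "y": rng.integers(0, 2, 500),
    "state": rng.choice(["A", "B", "C"], 500),
    "age_group": rng.choice(["18-34", "35-64", "65+"], 500),
})

# Step 1 ("regression"): model opinion given demographics and geography.
model = smf.logit("y ~ C(state) + C(age_group)", data=survey).fit(disp=False)

# Step 2 ("poststratification"): predict for every population cell and
# weight by the cell's share of the target population (e.g., census counts).
cells = pd.MultiIndex.from_product(
    [["A", "B", "C"], ["18-34", "35-64", "65+"]],
    names=["state", "age_group"]).to_frame(index=False)
cells["N"] = rng.integers(1000, 5000, len(cells))  # census cell sizes
cells["p_hat"] = model.predict(cells)

# State-level estimates: cell predictions weighted by cell population.
num = cells.assign(w=cells.N * cells.p_hat).groupby("state")["w"].sum()
den = cells.groupby("state")["N"].sum()
print(num / den)
```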

>> Optimizing Data Analysis Strategies for Complex Relationships
  • Consider discretizing continuous predictors into three categories (with roughly 1/3 of the data in each category) instead of two, as this approach can improve the interpretability of results for non-technical audiences while maintaining around 80-90% efficiency compared to linear regression; a sketch follows this subsection. (Gelman and Park 2009)

  • Avoid making inferences about individual-level relationships based solely on aggregate data, and instead employ multilevel modeling techniques to capture the complexity of relationships across different levels of analysis. (Gelman 2007b)
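
A small sketch of the Gelman and Park (2009) suggestion referenced above, on simulated data: cut the predictor into thirds and report the upper-third versus lower-third comparison.

```python
# Discretize a continuous predictor into thirds; compare outer categories.
# Simulated data, for illustration only.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({"x": rng.normal(size=300)})
df["y"] = 0.5 * df.x + rng.normal(size=300)

# Three roughly equal-sized categories of the predictor.
df["x3"] = pd.qcut(df.x, q=3, labels=["low", "middle", "high"])

# The interpretable comparison: mean outcome in the top vs. bottom third.
means = df.groupby("x3", observed=True)["y"].mean()
print(means["high"] - means["low"])
```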

>> Combining Poll Data with Contextual Factors for Better Predictions
  • Avoid relying solely on historical poll data when making predictions about future elections, as these polls may follow a random walk pattern rather than indicating a clear trend towards one candidate or another. Instead, researchers might consider incorporating additional factors such as economic conditions, demographic changes, and district-level analyses to improve the accuracy of their forecasts. (Gelman et al. 2005)

> Avoiding Bias and Inconsistencies in Electoral Studies

>> Distinguishing Concepts and Models for Unbiased Estimates
  • Avoid collapsing multiparty electoral data into a binary format, as doing so leads to biased and incomplete inferences. Instead, researchers should employ statistical models specifically tailored to handle multiparty data structures, such as the authors' proposed model, which accounts for the unique features of these systems and allows for accurate estimation of contextual effects. (Katz and King 1999)

  • Consider multiple perspectives and potential sources of bias when analyzing the impact of redistricting on electoral responsiveness and partisan bias, including the intentions of the redistricters, the timing of redistricting events, and the presence of confounding factors such as incumbency and turnout. (Gelman and King 1994)

  • Carefully distinguish and estimate separately the concepts of partisan bias and democratic representation in studying the relationship between legislative seats and citizen votes, as they are related but distinct concepts that can lead to inconsistent statistical estimates if conflated. (G. King and Browning 1987)

  • Differentiate between theoretical concepts and your empirical measurements, allowing you to estimate uncertainty around your estimates using statistical techniques. (NA?)

>> Measuring Incumbency Advantage Unbiasedly in Congressional Elections
  • Avoid using incumbent votes as a proxy for incumbency advantage, as it leads to biased estimates due to the confounding effect of partisan predispositions. Instead, researchers should aim to measure incumbency advantage directly and use exogenous measures of constituency service, such as legislative operating budgets, to estimate its impact accurately. (G. King 1991a)

  • Carefully account for the potential impact of incumbency advantage when studying electoral outcomes, as it can significantly affect measures of electoral responsiveness and partisan bias. (G. King and Gelman 1991)

  • Avoid using biased or inconsistent measures of incumbency advantage in congressional elections, such as sophomore surge and retirement slump, and instead employ an unbiased estimator based on a simple linear regression model. (Gelman and King 1990)
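
A hedged sketch of a regression estimator in this spirit, on simulated district data; the variable names and data-generating process are invented. Controlling for the lagged vote share and the party holding the seat, the coefficient on the incumbency indicator estimates the incumbency advantage.

```python
# Regression-based incumbency-advantage sketch (invented data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 400
df = pd.DataFrame({
    "vote_lag": rng.uniform(0.3, 0.7, n),   # Dem share, previous election
    "party": rng.choice([-1, 1], n),        # -1 Rep seat, +1 Dem seat
})
# incumbency: -1 Rep incumbent running, 0 open seat, +1 Dem incumbent.
df["incumbency"] = df.party * rng.integers(0, 2, n)
df["vote"] = (0.5 + 0.6 * (df.vote_lag - 0.5) + 0.02 * df.party
              + 0.05 * df.incumbency + rng.normal(0, 0.04, n))

fit = smf.ols("vote ~ vote_lag + party + incumbency", data=df).fit()
print(fit.params["incumbency"])  # estimated incumbency advantage
```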

>> Addressing Assumptions
  • Carefully evaluate the plausibility of the continuity assumption in Geographic Regression Discontinuity (GRD) designs, as precise sorting of agents around geographic boundaries can undermine its validity and compromise the credibility of the design. (L. J. Keele and Titiunik 2015)

  • Carefully evaluate the plausibility of the local geographic ignorability assumption (Assumption 4) by examining the balance of pretreatment covariates across treatment and control groups in increasingly narrow bands around the geographic boundary, and perform the analysis in the band where the covariates are indistinguishable in both areas. (L. Keele and Titiunik 2015)

  • Carefully consider and justify your methodological assumptions when comparing the directional and proximity models of voter decision making, as these assumptions significantly impact the results and conclusions drawn from the data. (Lewis and King 1999)

>> Leveraging Natural Experiments and Addressing Confounders
  • Carefully consider the potential for confounding variables when analyzing data from a randomized natural experiment, particularly when only one randomization is performed for each event, and employ appropriate statistical methods to address this issue. (Daniel E. Ho and Imai 2008)

  • Leverage the exact randomization procedures inherent in natural experiments, like the California alphabet lottery for ballot ordering, to derive accurate nonparametric confidence intervals using methods such as Fisher's exact test and randomization inference. (Daniel E. Ho and Imai 2006)

> Contextualizing Voter Perceptions in Natural Disaster Evaluation

>> Accounting for Historical Context and Subjective Factors
  • Be cautious about assuming a straightforward link between objective indicators of incumbent performance and voters' subjective perceptions of their own well-being, as demonstrated by the finding that voters punished incumbent president Woodrow Wilson for a series of shark attacks in New Jersey in 1916. (Fowler and Hall 2018)

  • Consider historical context and changing voter expectations when analyzing the effects of natural disasters on elections, as demonstrated by the finding that the 1927 Mississippi Flood had a significantly negative impact on Herbert Hoover's 1928 presidential campaign despite his involvement in relief efforts. (Heersink, Peterson, and Jenkins 2017)

> Ecological Inference and Race Prediction

>> Limitations of King's Ecological Inference Method
  • Exercise caution when using King's method for ecological inference, as it often produces inaccurate results and unreliable diagnostics, especially compared to simpler models such as the neighborhood model and ecological regression. (Schuessler 1999)
>> Considerations for Analyzing Complex Contingency Tables
  • Carefully consider and address the potential impact of distributional effects, contextual effects, and aggregation effects when making inferences from ecological data, as these factors can significantly affect the accuracy and reliability of estimates. (Imai, Lu, and Strauss 2007)

  • Carefully consider the trade-offs between computational complexity and statistical efficiency when choosing between Bayesian and frequentist approaches for analyzing ecological inference problems involving R × C contingency tables, especially when incorporating covariates. (Rosen et al. 2001)

>> Enhancing Racial Classification Using Multiple Data Sources
  • Employ a fully Bayesian approach to address census measurement error and utilize supplementary name data to enhance the accuracy of race imputation, particularly for racial minorities. (Rosenman, Olivella, and Imai 2022)

  • Consider extending traditional Bayesian methods for predicting individual race using surname lists by incorporating multiple sources of information, such as geolocation, demographics, and party registration, to improve the accuracy of your predictions. (Imai and Khanna 2016)
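
A toy Bayes-rule calculation of the kind underlying surname-plus-geolocation race prediction; every probability table below is invented, whereas real applications draw them from Census surname lists and geocoded demographic data.

```python
# Toy BISG-style posterior: P(race | surname, geography).
import numpy as np

races = ["white", "black", "hispanic", "asian", "other"]

# P(race | surname), e.g., from a Census surname list (hypothetical values).
p_race_given_surname = np.array([0.70, 0.10, 0.10, 0.05, 0.05])

# P(geolocation | race): share of each group living in the voter's
# block group (hypothetical values).
p_geo_given_race = np.array([0.002, 0.010, 0.004, 0.001, 0.003])

# Bayes rule, assuming surname and location are independent given race:
posterior = p_race_given_surname * p_geo_given_race
posterior /= posterior.sum()
print(dict(zip(races, posterior.round(3))))
```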

> Innovations in Ideal Point Estimation Techniques

>> Non-Parametric Unfolding for Binary Choice Data Classification
  • Use a non-parametric unfolding technique to accurately classify binary choice data, which involves solving two subproblems: finding a cutting plane that optimally separates the data into two categories based on a set of chooser points, and determining the optimal location of each chooser point given a set of cutting planes. (Poole 2000)
>> Addressing Challenges in Dimensionality Reduction and Model Identification
  • Carefully consider identifiability issues in logistic regression models, particularly in the presence of additive and multiplicative aliasing, and employ methods such as hierarchical modeling, linear transformations, and informative regression predictors to address these challenges. (Bafumi et al. 2005)

  • Consider using dynamic item response models with Bayesian inference for estimating ideal points in political science, especially when there is potential for preference change over time. (Martin and Quinn 2002)

  • Incorporate prior beliefs about the dimensions underlying the proposal space using Bayesian methods, enabling researchers to identify the model, discern the substantive content of the recovered dimensions, assess dimensionality, and check the model's validity through vote-specific discrimination parameters. (Jackman 2001)

>> Faster Spatial Voting Model Calibration via EM Algorithms
  • Consider using Expectation Maximization (EM) algorithms for estimating ideal points in spatial voting models, which can significantly reduce computation time while producing nearly identical results compared to traditional methods such as Markov Chain Monte Carlo (MCMC). (Imai, Lo, and Olmsted 2016)

> Redistricting Algorithms and Validation Techniques

>> Redistricting Simulation Algorithms and Validation Approaches
  • Utilize simulation algorithms to generate alternative redistricting plans that conform to federal and state laws, allowing for rigorous evaluation of potential biases in enacted plans through comparison with these alternatives. (McCartan et al. 2022)

  • Consider using Markov chain Monte Carlo (MCMC) algorithms to simulate redistricting plans, specifically the proposed Swendsen-Wang algorithm extended with simulated tempering and divide-and-conquer approaches, to generate a representative sample of redistricting plans under various constraints. (Fifield et al. 2020)

> Mitigating Bias and Enhancing Validity through Advanced Techniques

>> Electronic Validation & Counterbalancing for Sensitive Topics
  • Consider using counterbalanced measures to account for acquiescence bias when assessing controversial or sensitive topics, as it can significantly inflate estimates of endorsement and distort correlations between beliefs and individual characteristics. (Hill and Roberts 2023)

  • Incorporate electronic validation methods using commercial records to improve the accuracy of survey responses, especially for sensitive or socially desirable topics like voting behavior. (Ansolabehere and Hersh 2012)

>> Probabilistic Record Linkage for Data Merge Uncertainty Quantification
  • Employ probabilistic models instead of deterministic algorithms when merging large data sets, as probabilistic models can quantify the uncertainty inherent in the merging process, allowing for better calibration and accounting of false positives and false negatives. (Enamorado, Fifield, and Imai 2019)

  • Prioritize transparency and replicability in your analysis by utilizing open-source tools like fastLink for probabilistic record linkage, which allows for the merging of survey responses with administrative records to improve the accuracy of self-reported turnout rates. (Enamorado and Imai 2019)
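
A toy Fellegi-Sunter-style calculation of the sort that probabilistic linkage tools such as fastLink build on: field-level agreement patterns are converted into a posterior match probability. The m- and u-probabilities and the prior below are invented; fastLink estimates these quantities from the data via the EM algorithm.

```python
# Toy Fellegi-Sunter match probability (invented parameters).
import numpy as np

# m[k]: P(field k agrees | true match); u[k]: P(field k agrees | non-match).
m = np.array([0.95, 0.90, 0.85])   # first name, last name, birth year
u = np.array([0.02, 0.01, 0.05])
prior_match = 0.001                # P(match) among candidate pairs

def match_probability(agree):
    """Posterior P(match | agreement pattern) for a 0/1 agreement vector."""
    agree = np.asarray(agree, dtype=bool)
    like_m = np.prod(np.where(agree, m, 1 - m))
    like_u = np.prod(np.where(agree, u, 1 - u))
    post = prior_match * like_m
    return post / (post + (1 - prior_match) * like_u)

print(match_probability([1, 1, 1]))  # all fields agree
print(match_probability([1, 0, 1]))  # last name disagrees
```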

>> Addressing Publication and Social Desirability Biases in Survey Experiments
  • Carefully consider the potential impact of publication bias and social desirability bias when interpreting the results of survey experiments, as these factors could contribute to the observed discrepancies between field and survey experimental findings on the effects of providing corruption information to voters. (Incerti 2020)

> Detecting Election Fraud Using Last Digit Distribution

>> Last digit test for election fraud under specific scope conditions
  • Be aware of the potential for non-uniformity in the distribution of last digits in election results, which can be an indicator of fraud, but only if certain scope conditions are met, such as avoiding datasets with a preponderance of single- and double-digit counts or clusters of vote counts within a narrow range. (NA?)
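
A minimal version of the test on simulated precinct totals: when the scope conditions above hold (counts comfortably in the hundreds or more), last digits should be approximately uniform on 0-9, and a chi-square test flags departures.

```python
# Last-digit uniformity test on simulated precinct vote totals.
import numpy as np
from scipy.stats import chisquare

rng = np.random.default_rng(3)
vote_counts = rng.integers(500, 20000, size=2000)  # precinct totals

last_digits = vote_counts % 10
observed = np.bincount(last_digits, minlength=10)
stat, p_value = chisquare(observed)  # expected: uniform across digits
print(stat, p_value)
```
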
>> Incorporating Qualitative Information for Causal Inference
  • Consider incorporating qualitative information into quantitative analyses to improve causal inference in observational studies, specifically by using qualitative information on outcomes within matched sets to reduce p-values, constructing qualitative confidence intervals on effect size based on additional information across matched sets, and reducing the conservativeness of sensitivity analysis by incorporating qualitative information on unmeasured confounders within matched sets. (Glynn and Ichino 2014)

Improving Accuracy and Robustness in Quantitative Analysis

> Addressing Bias, Uncertainty, and Model Specification Issues

>> Considerations for Adding Control Variables in Regression Models
  • Do not assume that including additional relevant variables in a regression model will necessarily reduce omitted variable bias, as the impact of doing so depends on various factors such as the relationships among the variables and their variances. (K. A. Clarke 2005)

  • Exercise caution when adding control variables to a statistical model, as doing so may either increase or decrease the bias of the estimated coefficient of interest, depending on complex interactions among the variables involved. (Pianka, n.d.)

>> Addressing Common Challenges in Causal Inference Models
  • Use a maximum-likelihood estimator for selection models with dichotomous dependent variables when identical factors affect the selection equation and the equation of interest, as this approach avoids reliance solely on distributional assumptions about the residuals and allows for better identification compared to traditional Heckman-type estimators. (NA?)
>> Avoiding Biases and Misleading Inferences in Causal Estimation
  • Do not rely solely on robust standard errors to address potential model misspecifications, but instead use them as a tool to identify issues and then attempt to resolve those issues through model respecification. (G. King and Roberts 2015)

  • Focus on making causal inferences rather than searching for a “true” model, as the latter approach leads to loss of leverage over research questions and hinders accurate estimation of causal effects. (G. King 1991b)

  • Avoid using the Regression on Residuals (ROR) estimator, which involves regressing the residuals from one regression onto another set of predictors, as it leads to biased estimates due to omitted variable bias. Instead, researchers should include all relevant predictors in a single multiple regression model to obtain unbiased estimates. (G. King 1986)

>> Enhancing Estimation Techniques and Reporting Practices
  • Utilize Bayesian Model Averaging (BMA) to account for model uncertainty, which enables the calculation of posterior distributions over coefficients and models, leading to more robust and reliable estimates compared to traditional frequentist methods. (Hinne et al. 2020)

  • Prioritize the observed-value approach over the average-case approach when generating predictions from limited dependent variable models, as it produces estimates of the average effect in the population rather than the effect for an artificial “average case”; a sketch follows at the end of this subsection. (Hanmer and Kalkan 2012)

  • Avoid making strong claims about level-2 effects in multilevel models when the sample size of level-2 units is small and the units may not be representative, and instead consider using visualizations to explore patterns in the data. (Bowers and Drake 2005)

  • Consider using the beta distribution for modeling dependent variables that are proportions, as it recognizes the inherent relationship between the mean and variance in such data, leading to potentially more accurate and efficient estimates compared to traditional normal-linear models. (Paolino 2001)

  • Utilize statistical simulation techniques to extract and communicate the most relevant and interpretable information from your statistical results, thereby improving the transparency and accessibility of your findings. (G. King, Tomz, and Wittenberg 2000)

  • Always report measures of uncertainty, such as standard errors and confidence intervals, alongside your estimates generated by limited dependent variable models to avoid overstating the precision of your findings. (Herron 1999)
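
The sketch below illustrates the observed-value versus average-case contrast from Hanmer and Kalkan (2012) on simulated data: the effect of a one-unit covariate shift is averaged over the observed cases rather than evaluated at an artificial "average" respondent.

```python
# Observed-value vs. average-case predicted effects (simulated data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 1000
df = pd.DataFrame({"x": rng.normal(size=n), "z": rng.normal(size=n)})
p = 1 / (1 + np.exp(-(0.5 + 1.0 * df.x + 0.5 * df.z)))
df["y"] = (rng.uniform(size=n) < p).astype(int)

fit = smf.logit("y ~ x + z", data=df).fit(disp=False)

# Effect of shifting x by one unit, averaged over observed covariates:
effect_obs = (fit.predict(df.assign(x=df.x + 1)) - fit.predict(df)).mean()
print("observed-value effect:", effect_obs)

# Contrast: the same shift evaluated at an artificial "average case".
avg = df.mean().to_frame().T
effect_avg = (fit.predict(avg.assign(x=avg.x + 1)).mean()
              - fit.predict(avg).mean())
print("average-case effect:", effect_avg)
```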

> Event Count Data Modeling & Efficient Sampling Techniques

>> Event Count Regression Models: Addressing Assumptions & Enhancing Estimates
  • Carefully consider the assumptions of independence and homogeneity in event count models, as violations of these assumptions can lead to inefficient estimates and biased standard errors. (G. King 1989b)

  • Employ event count regression models, which combine the strengths of traditional regression analysis with Poisson process models, to better understand the underlying continuous processes driving international events data. (G. King 1989a)

  • Avoid using ordinary least squares (OLS) and logged OLS (LOLS) models for event count data due to issues of functional form, heteroskedasticity, and efficiency, and instead opt for the exponential Poisson regression (EPR) model, which provides a better fit for the data and yields more accurate parameter estimates. (G. King 1988)
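
A quick illustration of the contrast drawn above, on simulated data: an exponential Poisson regression fitted as a GLM versus OLS on logged counts. Applied work should also check for overdispersion (e.g., against a negative binomial model).

```python
# Poisson GLM vs. logged-OLS for event counts (simulated data).
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 500
df = pd.DataFrame({"x": rng.normal(size=n)})
df["events"] = rng.poisson(np.exp(0.2 + 0.7 * df.x))

# Exponential Poisson regression: right functional form and
# mean-variance relationship for counts.
poisson_fit = smf.glm("events ~ x", data=df,
                      family=sm.families.Poisson()).fit()
print(poisson_fit.params)

# The problematic alternative: OLS on log(count + 1) distorts both the
# functional form and the error structure.
df["log_events"] = np.log(df.events + 1)
print(smf.ols("log_events ~ x", data=df).fit().params)
```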

>> Optimizing Rare Events Studies with Stratified Sampling
  • Consider using choice-based or endogenous stratified sampling, which involves collecting all available instances of the rare event (e.g., wars) and a small random sample of non-events, to improve the efficiency of data collection and enable the collection of more meaningful explanatory variables while still allowing for valid inferences through appropriate statistical corrections. (G. King and Zeng 2001a)

  • Consider collecting all available instances of rare events (ones) and a small random sample of non-rare events (zeros) to improve the efficiency of data collection and reduce the underestimation of event probabilities in logistic regression analysis. (G. King and Zeng 2001b)
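
The intercept ("prior") correction from King and Zeng is simple enough to state in a few lines. The values below are hypothetical, and tau, the population event rate, is assumed known.

```python
# King-Zeng prior correction of the logit intercept after
# case-control sampling of rare events (hypothetical values).
import numpy as np

tau = 0.005    # true population fraction of events (assumed known)
ybar = 0.50    # fraction of events in the sampled data (by design)

beta0_sample = -0.25  # intercept estimated on the sampled data

beta0_corrected = beta0_sample - np.log(((1 - tau) / tau)
                                        * (ybar / (1 - ybar)))
print(beta0_corrected)
```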

>> Alternative Tests for Non-Nested Model Selection
  • Consider using a distribution-free test for nonnested model selection when dealing with highly peaked distributions, as it is asymptotically more efficient than the commonly used Vuong test under these conditions. (K. A. Clarke 2007)

  • Consider using a nonparametric test for relative model discrimination, specifically the paired sign test, as it outperforms the commonly used Vuong test in situations with small sample sizes and high correlation between models. (K. A. Clarke 2003)

> Causal Inference Challenges & Solutions in Experiments

>> Optimizing Treatment Effect Estimation and Dynamic Interactions
  • Use a formal two-step framework to estimate heterogeneous treatment effects from randomized experiments and then use this information to optimize policies regarding which treatment should be given to whom, while being mindful of the risks of false discoveries in post hoc subgroup analysis. (Imai and Strauss 2011)
>> Optimizing Randomization Techniques for Valid Causal Inferences
  • Collect extensive pretreatment information, efficiently randomize treatments using advanced methods like randomized-block and matched-pair designs, accurately record treatment receipt, and carefully account for noncompliance and nonresponse to ensure valid causal inferences in randomized experiments. (Horiuchi, Imai, and Taniguchi 2007; Imai and Ning 2023)

>> Survey Design Pitfalls & Mitigation Strategies for Causal Inference
  • Carefully consider the four dimensions of external validity (X-, T-, Y-, and C-validity) when conducting causal inference studies, and use appropriate statistical methods to ensure that your findings can be generalized to broader populations, treatments, outcomes, and contexts. (Egami and Hartman 2022)

  • Carefully evaluate the assumptions of excludability and ignorability when using the Unexpected Event during Surveys Design (UESD) to estimate causal effects, particularly considering potential threats such as collateral events, simultaneous events, unrelated time trends, and endogenous timing of the event. (Muñoz, Falcó-Gimeno, and Hernandez 2019)

  • Carefully assess and address the risk of information equivalence (IE) violations in survey experiments, as failure to do so can lead to biased estimates of the causal effect of interest due to changes in subjects' beliefs about background features of the scenario induced by the experimental manipulation. (Bansak, Hainmueller, and Yamamoto 2017)

  • Exercise caution when extrapolating from survey experiments due to potential differences in treatment effects between artificial and real-world settings. (Barabas and Jerit 2010)

>> Addressing Common Pitfalls in Drawing Causal Conclusions
  • Avoid dropping subjects based on a manipulation check, as it can lead to biased estimates or hinder the identification of causal effects, unless certain conditions are met, such as the manipulation check being independent of the treatment or the treatment having no effect on the likelihood of passing the manipulation check. (Aronow, Baron, and Pinson 2019)

  • Carefully evaluate the comparability of treatment and control groups in natural experiments, as random assignment alone does not guarantee validity, and alternative research designs may be necessary to ensure accurate causal inferences. (Sekhon and Titiunik 2012)

  • Exercise caution when using matching techniques to draw causal inferences from observational data, as demonstrated by the authors' finding that matching greatly exaggerated the estimated impact of pre-election phone calls on voter turnout compared to an experimental benchmark, and produced implausible results for a separate intervention. (Arceneaux, Gerber, and Green 2010)

  • Prioritize the careful consideration of the assignment mechanism in causal inference, especially in observational studies, as emphasized by the Neyman-Rubin model, since it is crucial for establishing the conditions necessary for drawing valid causal conclusions. (Sekhon 2009)

  • Carefully consider and appropriately address potential sources of error in field experiments, including deviations from intended experimental protocols and incomplete randomization, through rigorous statistical methods such as propensity score matching. (NA?)

> Addressing Bias and Assumptions in Causal Inference

>> Omitted Variable Bias & Model Misspecification in Multiplicative Interactions
  • Carefully evaluate the appropriateness of the Linear Interaction Effect (LIE) assumption in multiplicative interaction models, as it often fails in empirical settings, leading to potentially biased and inconsistent estimates. Additionally, researchers should ensure adequate common support in the data to avoid making inferences based solely on interpolation or extrapolation. (Hainmueller, Mummolo, and Xu 2018)

  • Always include all constitutive terms in your interaction model specifications unless there is a strong theoretical justification and evidence supporting the omission of a term, as failing to do so can lead to biased and inconsistent estimates due to omitted variable bias; a sketch follows this list. (Brambor, Clark, and Golder 2006)

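A sketch of the recommended practice on simulated data: fit the full specification with both constitutive terms, then report the conditional marginal effect of x across values of z with its standard error, rather than interpreting the interaction coefficient in isolation.

```python
# Full interaction specification plus conditional marginal effects.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n = 800
df = pd.DataFrame({"x": rng.normal(size=n), "z": rng.normal(size=n)})
df["y"] = 1 + 0.5 * df.x - 0.3 * df.z + 0.8 * df.x * df.z + rng.normal(size=n)

fit = smf.ols("y ~ x + z + x:z", data=df).fit()  # both constitutive terms

# Marginal effect of x given z: beta_x + beta_xz * z, with a standard
# error from the coefficient covariance matrix.
b, V = fit.params, fit.cov_params()
for z0 in (-1.0, 0.0, 1.0):
    me = b["x"] + b["x:z"] * z0
    se = np.sqrt(V.loc["x", "x"] + z0 ** 2 * V.loc["x:z", "x:z"]
                 + 2 * z0 * V.loc["x", "x:z"])
    print(f"z = {z0:+.1f}: dy/dx = {me:.3f} (se {se:.3f})")
```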

>> Addressing Omitted Interaction Bias via Post-Double Selection
  • Be cautious about adding a single interaction term to a regression model, as this can lead to omitted interaction bias due to unmodeled interactions between the effect modifier and other covariates. Instead, the authors recommend using a post-double selection approach that combines multiple lasso estimators to select interactions for inclusion in the final model, thereby improving stability and reducing bias. (Blackwell and Olson 2021)
>> Addressing Biases from Homogeneous Partial Effects and Temporal Dynamics
  • Ensure that the variation in the endogenous regressor related to the instrumental variable has the same causal effect as variation unrelated to the instrument, as failing to meet this assumption of homogeneous partial effects can lead to biased estimates when using instrumental variables regression. (Dunning 2008)
>> Explicit Identification Strategies for Rigorous Causal Inference
  • Use causal graphs to explicitly define your identification strategy and delineate which effects are presumed to be identified, thereby ensuring that only identified effects are assigned causal interpretations. (L. Keele, Stevenson, and Elwert 2019)

  • Adopt an explicit identification strategy, which includes making clear assumptions about the causal relationships in your study, in order to address the inherent identification problem in causal inference. (L. Keele 2015)

  • Prioritize causal identification through experimental or natural experimental designs, leveraging the identifying power of such designs to establish specific causal facts for well-defined subpopulations, rather than relying solely on traditional regression studies with informally motivated control variables. (NA?)

>> Strengthening Instrument Variables (IV) Estimates
  • Carefully evaluate the strength of your instruments, account for potential sources of non-i.i.d. error structures, and acknowledge the high degree of uncertainty associated with IV estimates, especially when comparing them to OLS estimates. (Lal et al. 2023)

  • Carefully evaluate the strength of your instruments, consider alternative inferential methods beyond traditional t-tests, and critically examine the differences between 2SLS and OLS estimates to ensure compliance with the exclusion restriction assumption. (Kang et al. 2020)
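
A minimal first-stage strength check on simulated data. A single first-stage F statistic is only the crudest diagnostic, and the papers above argue for considerably more caution, but it is the natural starting point.

```python
# First-stage F statistic for an excluded instrument (simulated data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 1000
df = pd.DataFrame({"z": rng.normal(size=n)})      # instrument
df["d"] = 0.3 * df.z + rng.normal(size=n)          # endogenous regressor
df["y"] = 0.5 * df.d + rng.normal(size=n)

first_stage = smf.ols("d ~ z", data=df).fit()
# Values below the rule-of-thumb threshold of 10 signal a weak
# instrument; recent work urges caution well above that, too.
print(first_stage.f_test("z = 0"))
```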

>> Addressing Common Sources of Bias in Causal Estimation
  • Utilize the Imbens-Angrist instrumental variable model along with the monotone treatment response assumption to identify the joint distribution of potential outcomes among compliers in order to determine the percentage of persuaded individuals and the statistical properties of that estimate. (Fu, Narasimhan, and Boyd 2020)

  • Avoid conditioning on post-treatment variables in experiments, as doing so can create biases in estimated treatment effects by introducing imbalances in other confounding variables that were initially balanced through randomization. (Montgomery, Nyhan, and Torres 2018)

  • Be aware of the potential for coarsening bias when using instrumental variable (IV) estimation with coarsened treatment measures, as failing to account for this bias can lead to inconsistent estimates of treatment effects and potentially misleading policy recommendations. (Marshall 2016)

  • Be aware of and account for differential measurement error, which occurs when measurement errors are not independent of the outcome variable, as it can lead to biased estimates of causal effects and incorrect conclusions. (Imai and Yamamoto 2010)

>> Addressing Confounding and Bias Amplification in Model Selection
  • Conduct sensitivity analyses to evaluate the robustness of your findings to potential unobserved confounding, specifically by estimating the magnitude of the association between the unobserved confounder and the treatment and outcome necessary to alter the conclusions drawn from the statistical analysis. (Cinelli and Hazlett 2019)

  • Carefully evaluate the potential for bias amplification when including covariates in your statistical models, especially when using fixed effects for groups, as these can act as pure bias amplifiers and potentially increase bias rather than reducing it. (Middleton et al. 2016)

  • Consider the potential impact of unmeasured confounding on your causal estimates by varying the confounding function, which quantifies the extent of selection bias, and assessing the robustness of your results to different levels of unmeasured confounding. (NA?)

>> Conjoint Analysis, Interaction Effects, and Distribution Choice
  • Carefully consider the distribution of profile attributes when conducting conjoint analysis, as the choice of distribution can significantly impact the external validity of results, particularly when there are interactions between attributes or when the target profile distribution deviates from uniform. (Cuesta, Egami, and Imai 2021)

  • Avoid drawing conclusions about subgroup differences in preferences solely based on differences in conditional AMCEs (average marginal component effects), as these differences can be misleading due to the sensitivity of interaction terms to the reference category used in regression analysis. (Leeper, Hobolt, and Tilley 2019)

  • Consider using conjoint analysis, a type of experimental design that enables the simultaneous estimation of the causal effects of multiple treatment components, allowing for comparisons on the same scale and improving internal validity compared to more model-dependent procedures. (Hainmueller, Hopkins, and Yamamoto 2014)
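
A bare-bones AMCE estimate on simulated profiles: with uniformly randomized attributes, a linear regression of the choice indicator on attribute dummies recovers average marginal component effects relative to the omitted reference levels. A real analysis would also cluster standard errors by respondent; that is omitted here.

```python
# Bare-bones AMCE via linear regression (simulated conjoint profiles).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
n = 3000  # profile evaluations
df = pd.DataFrame({
    "gender": rng.choice(["male", "female"], n),
    "experience": rng.choice(["low", "high"], n),
})
p = 0.5 + 0.10 * (df.gender == "female") + 0.05 * (df.experience == "high")
df["chosen"] = (rng.uniform(size=n) < p).astype(int)

# With uniform randomization, coefficients on attribute dummies are
# AMCEs relative to the omitted reference level.
fit = smf.ols("chosen ~ C(gender) + C(experience)", data=df).fit()
print(fit.params)
```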

>> Factorial Experiments for Estimating Interaction Effects
  • Consider using factorial experiments to investigate causal interactions, which allows for the estimation of average marginal interaction effects (AMIEs) that are invariant to the choice of baseline condition and enable effect decomposition and regularization through ANOVA. (Egami and Imai 2018)

> Regression Discontinuity Design: Assumptions, Limitations, and Solutions

>> Regression Discontinuity Design: Validating Assumptions and Enhancing Inferences
  • Carefully choose between the continuity and local randomization frameworks for analyzing regression discontinuity designs, considering factors such as the nature of the data, the presence of continuity in the regression functions, and the plausibility of the exclusion restriction assumption. (Arai, Otsu, and Seo 2021)

  • Leverage knowledge about exogenous noise in the running variable to complement traditional continuity-based analyses in regression discontinuity designs, allowing for more robust causal inferences and improved identification of policy-relevant estimands. (Eckles et al. 2020)

  • Carefully verify the presence of a discontinuous change in the probability of treatment assignment at the known threshold, as this is a crucial requirement for implementing a Regression Discontinuity (RD) design. (Cattaneo, Idrobo, and Titiunik 2019)

  • Carefully examine the validity of the continuity assumption in regression discontinuity designs, as it plays a crucial role in ensuring that the estimated treatment effect accurately represents the true causal impact of the intervention. (Imbens and Lemieux 2007)
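
A stylized local-linear RD estimate on simulated data: within a bandwidth of the cutoff, regress the outcome on treatment, the centered running variable, and their interaction. The bandwidth here is picked by hand; real analyses should use data-driven bandwidth selection and robust bias-corrected inference (e.g., the rdrobust family of packages).

```python
# Local-linear RD sketch with a hand-picked bandwidth (simulated data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)
n = 2000
df = pd.DataFrame({"r": rng.uniform(-1, 1, n)})    # running variable
df["treated"] = (df.r >= 0).astype(int)
df["y"] = 1 + 0.5 * df.r + 0.4 * df.treated + rng.normal(0, 0.3, n)

h = 0.25                                           # assumed bandwidth
local = df[df.r.abs() <= h]
fit = smf.ols("y ~ treated + r + treated:r", data=local).fit()
print(fit.params["treated"])  # estimated jump at the cutoff
```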

>> Difference-in-Discontinuities Approach for Geographic RDD
  • Utilize the difference-in-discontinuities approach in geographic regression discontinuity designs to account for time-invariant sorting and other policy changes at the border, allowing for more robust estimates of treatment effects. (Butts 2021)
>> Regression Discontinuity Design: Balance and Continuity Assumption
  • Carefully consider and test the assumption of covariate balance in regression discontinuity designs, as violations of this assumption can lead to biased estimates. (Caughey and Sekhon 2011)

  • Carefully distinguish between the continuity assumption and the local randomization assumption when implementing the regression discontinuity design, as the latter is more stringent and can lead to incorrect inferences if incorrectly invoked. (NA?)

>> Regression Discontinuity Design: Bias-Variance Trade-offs and Overestimation Risks
  • Be aware of the inherent bias-variance trade-off in regression discontinuity (RD) designs, which arises due to the lack of observations at the cutoff and necessitates careful consideration of the choice of bandwidth when estimating causal effects. (Stommes, Aronow, and Sävje 2023)

  • Be cautious when interpreting results obtained via Regression Discontinuity Design (RDD) due to the risk of overestimation caused by inappropriate statistical procedures, low statistical power, and potential publication bias. (Stommes, Aronow, and Sävje 2021)

>> High-Degree Polynomial Control Risks in RDD
  • Avoid using high-order polynomial regressions in regression discontinuity designs due to their susceptibility to misleading estimates caused by noisy weights, sensitivity to the degree of the polynomial, and failure to achieve nominal coverage in inferences. (Gelman and Imbens 2018)

  • Exercise caution when using high-degree polynomial controls in regression discontinuity designs due to the risk of overfitting, leading to biased estimates and inflated Type I errors. (NA?)

> Mitigating Bias in Sensitive Survey Questions

>> Randomized Response Techniques for Evasion Reduction
  • Consider using randomized response techniques to mitigate evasive answer biases in surveys, particularly for sensitive topics, as it allows respondents to maintain privacy and provide truthful answers without disclosing sensitive information. (Warner 1965)
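
Warner's estimator is easy to state: with known probability p the respondent answers about the sensitive statement, otherwise about its negation, and the observed "yes" rate is inverted to recover the population prevalence. The numbers below are illustrative.

```python
# Warner (1965) randomized response estimator (illustrative numbers).
p = 0.7      # known randomization probability (must differ from 0.5)
lam = 0.44   # observed proportion of "yes" answers
n = 1000     # number of respondents

pi_hat = (lam - (1 - p)) / (2 * p - 1)
var_hat = lam * (1 - lam) / (n * (2 * p - 1) ** 2)
print(pi_hat, var_hat ** 0.5)  # prevalence estimate and its std. error
```
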
>> Enhancing Data Quality and Truthfulness in Subjective Measurement
  • Consider the trade-off between data quantity and data quality when making population inferences with big data, specifically focusing on the data defect correlation (ρ_{R,G}) as a measure of data quality, rather than solely relying on probabilistic uncertainty assessments; a numerical check follows this subsection. (Meng 2018)

  • Employ an information-scoring system that assigns high scores to answers that are more common than collectively predicted, rather than relying on consensus-based methods, in order to elicit truthful subjective data when objective truth is unknowable. (Prelec 2004)
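
The identity behind the data defect correlation flagged above can be checked numerically: for any selection indicator R over a finite population, the sample-minus-population error equals ρ_{R,G} · sqrt((N − n)/n) · σ_G, with all moments taken over the population. The simulation below is invented for illustration.

```python
# Numerical check of Meng's (2018) error identity (simulated population).
import numpy as np

rng = np.random.default_rng(10)
N = 100_000
G = rng.normal(size=N)                       # variable of interest
# Selection correlated with G: a non-representative "big data" sample.
R = (rng.uniform(size=N) < 1 / (1 + np.exp(-0.5 * G))).astype(float)
n = R.sum()

error = G[R == 1].mean() - G.mean()
rho = np.corrcoef(R, G)[0, 1]                # data defect correlation
identity = rho * np.sqrt((N - n) / n) * G.std()
print(error, identity)                       # the two agree exactly
```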

>> Accounting for Ambiguity in Decision Making Models
  • Carefully consider the impact of ambiguity, defined as uncertainty about unknown probabilities, on decision-making and account for it in your statistical models using appropriate measures such as source functions and ambiguity premiums. (Abdellaoui et al. 2011)
>> Sensitive Traits Estimation via Indirect Questioning Techniques
  • Carefully consider the potential impact of nonstrategic measurement error, such as careless responding or administrative errors, on the validity of your list experiments, and utilize robust statistical methods, such as the proposed uniform error model, to account for such errors. (Blair, Chou, and Imai 2019)

  • Incorporate auxiliary information, such as county-level election results, as additional moment conditions in the statistical analysis of indirect questioning techniques like the list experiment and randomized response technique, or as part of the prior distribution in the Bayesian hierarchical measurement model for the endorsement experiment, in order to improve the precision and accuracy of estimates of sensitive traits at lower levels of aggregation. (Chou, Imai, and Rosenfeld 2017)

  • Prioritize conducting validation studies to compare estimates of sensitive traits obtained through various survey methodologies against the corresponding truth, as demonstrated in this study using the official election outcome for a sensitive anti-abortion referendum. (Rosenfeld, Imai, and Shapiro 2015)

  • Utilize the randomized response technique to mitigate biases associated with sensitive questions, particularly when employing multivariate regression analysis, and carefully select appropriate designs based on assumptions about respondent compliance and knowledge of the randomization distribution. (Blair, Imai, and Zhou 2015)

  • Utilize a maximum likelihood estimator instead of a naive two-step estimator when incorporating predicted responses from list experiments into regression models, as it provides greater statistical efficiency and flexibility in handling different types of models. (Imai, Park, and Greene 2015)

  • Carefully consider the “no design effect” and “no liars” assumptions when implementing list experiments, as failure to meet these assumptions could lead to biased estimates of the population proportion of individuals giving affirmative answers to sensitive items; the estimator they underpin is sketched at the end of this subsection. (Blair and Imai 2012)

  • Enable multivariate regression analysis for the item count technique to accurately estimate the relationship between respondents' characteristics and their answers to sensitive questions, while maximizing statistical efficiency. (Imai 2011)

  • Employ the List Experiment (LISTIT) methodology, which uses a combination of control and treatment groups to accurately estimate the prevalence of sensitive behaviors or opinions, while protecting respondent anonymity and avoiding response bias. (Corstange 2009)
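
The basic difference-in-means estimator shared by these designs is worth seeing in code: the treatment list adds the sensitive item to J control items, so the difference in mean item counts estimates the prevalence of the sensitive trait under the "no design effect" and "no liars" assumptions discussed above. Data are simulated.

```python
# Difference-in-means estimator for a list experiment (simulated data).
import numpy as np

rng = np.random.default_rng(11)
n = 2000
J, true_prev = 3, 0.15

control = rng.binomial(J, 0.5, size=n)       # counts over J control items
treatment = (rng.binomial(J, 0.5, size=n)    # same items ...
             + rng.binomial(1, true_prev, size=n))  # ... plus sensitive item

est = treatment.mean() - control.mean()
se = np.sqrt(treatment.var(ddof=1) / n + control.var(ddof=1) / n)
print(est, se)
```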

>> Combining Multiple Measurement Models for Enhanced Estimates
  • Validate measurements of sensitive concepts using multiple survey instruments, such as list and endorsement experiments, and compare your results through statistical testing and multivariate regression models to ensure the accuracy and credibility of your findings. (Blair, Imai, and Lyall 2014)

  • Utilize a Bayesian hierarchical measurement model for endorsement experiments, which enables the combination of responses to multiple policy questions in a principled manner and produces efficient estimates of support levels for multiple political actors at both individual and aggregate levels. (Bullock, Imai, and Shapiro 2011)

> Enhance Cross-Cultural Comparability with Advanced Survey Techniques

>> Leveraging Anchoring Vignettes for Response Validation
  • Administer self-assessment questions immediately after anchoring vignettes to leverage priming effects and improve the accuracy of responses. (Hopkins and King 2010)

  • Use anchoring vignettes, which are supplemental survey questions that provide a common reference point for respondents using different standards for the same scale, to correct for response-category differential item functioning (DIF) and improve the comparability of survey responses. (G. King and Wand 2007)

  • Consider incorporating vignettes in your survey designs to directly measure and correct for interpersonal incomparability in responses, thereby enhancing the validity and cross-cultural comparability of your measurements. (King et al. 2004)

>> Robustness Check via Multiverse Analysis Across Data Processing Choices
  • Conduct a “multiverse analysis” to examine the robustness of your findings across different reasonable data processing choices, rather than relying solely on a single data set analysis that may be influenced by arbitrary processing decisions. (Steegen et al. 2016)
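
A toy multiverse on invented data: enumerate the reasonable processing choices, rerun the same analysis in every resulting "universe", and inspect the spread of estimates.

```python
# Toy multiverse analysis over data-processing choices (invented data).
import itertools
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(12)
n = 500
df = pd.DataFrame({"x": rng.normal(size=n), "age": rng.uniform(18, 80, n)})
df["y"] = 0.3 * df.x + rng.normal(size=n)

outlier_rules = {"none": np.inf, "3sd": 3.0, "2.5sd": 2.5}
age_filters = {"all": 18, "adults_25plus": 25}

results = []
for (o_name, o_cut), (a_name, a_min) in itertools.product(
        outlier_rules.items(), age_filters.items()):
    d = df[(df.y.abs() <= o_cut * df.y.std()) & (df.age >= a_min)]
    beta = smf.ols("y ~ x", data=d).fit().params["x"]
    results.append({"outliers": o_name, "age": a_name, "beta_x": beta})

print(pd.DataFrame(results))
```
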
>> Measurement Invariance for Valid Cross-Cultural Construct Comparison
  • Establish measurement invariance through confirmatory factor analysis before comparing constructs across cultures, as failure to do so could lead to misleading conclusions due to differences in interpretation rather than true differences in the constructs being studied. (Davidov 2009)
>> Measurement Differences Impact Assessment using EPC-Interest
  • Evaluate the Expected Parameter Change in Interests (EPC-Interest) to determine the potential impact of measurement differences on substantive comparisons, instead of solely focusing on achieving measurement invariance across groups. (Oberski 2014)

> Enhancing Validity and Clarity in Measurements and Communication

>> Systematically Studying Individual Differences in Judgment & Decision Making
  • Adopt a more systematic approach to studying individual differences in judgment and decision-making (JDM) by selecting theoretically relevant measures, focusing on interactions between individual differences and decision features, situational factors, and other individual differences, and communicating all results extensively. (Appelt et al. 2011)
>> Assuring Data Quality through Reliability and Validity
  • Adopt a comprehensive approach to data quality assessment, focusing on content validity, data generation process validity and reliability, and convergent validity, using a combination of qualitative and quantitative methods to ensure the appropriateness of measures for your research questions and cases. (McMann et al. 2021)

Enhancing Data Analysis Techniques for Environmental Science

> Enhanced Soil Analysis with Machine Learning Algorithms

>> Improving Soil Moisture Estimation through Vegetation Heterogeneity
  • Consider the heterogeneity of vegetation coverage within satellite sensor footprints when developing algorithms for estimating soil moisture, as this can significantly improve the accuracy of your estimates. (NA?)

> Improving Remote Sensing Data Quality through Advanced Analytics

>> Optimizing Satellite-Based Vegetation Measurements via Statistics
  • Carefully consider the impact of sensor calibration, atmospheric correction, and solar zenith angle on the accuracy of satellite-based vegetation index measurements, particularly when comparing data across multiple sensors and time periods. (Tucker et al. 2005)
>> Advanced Spatial-Temporal Interpolation Methods for Large Datasets
  • Employ a three-dimensional discrete cosine transform-based penalized least squares approach to address data gaps in large spatiotemporal datasets, as it effectively leverages both spatial and temporal information while avoiding overfitting and reducing the impact of smoothing on high-frequency components. (G. Wang et al. 2012)

  • Use an enhanced ecosystem-dependent interpolation (EEDI) algorithm when dealing with heterogeneous landscapes, as it improves upon traditional interpolation methods by incorporating multiple reference LAI time series, selecting only very strong phenological links, and employing an iterative process to update spatial-temporal constraint information. (NA?)

>> Deep Learning Gap-Filling for Time Series Data
  • Consider utilizing deep learning techniques for gap-filling in time series data, as demonstrated by the authors' successful implementation of a deep learning method for gap-filling in eddy covariance crop evapotranspiration data, which resulted in a significant reduction in normalized root mean square error compared to traditional methods. (NA?)

> Improving Image Classification and Species Distribution Models

>> Improved Metrics & ANNs for Land Cover Mapping
  • Consider using the newly introduced Classification Success Index (CSI), Individual Classification Success Index (ICSI), and Group Classification Success Index (GCSI) to evaluate the overall, individual, and group accuracy of classified images representing semi-natural woodland environments, respectively, as these indices offer significant improvements over traditional approaches by taking into account both errors of omission and commission. (Radwan et al. 2019)

> Enhanced Spatial & Temporal Analytics for Climate Research

>> High Resolution Sea Ice Database Development
  • Prioritize creating comprehensive databases with high spatial and temporal resolution, such as the new sea-ice database described here, which uses digitized weekly charts from multiple sources and covers both the Arctic and Antarctic for a ten-year period. (Knight 1984)
>> Combining Remote Sensing Techniques for Sea Ice Monitoring
  • Consider combining multiple remote sensing techniques, such as passive microwave and thermal infrared imagery, to improve the accuracy and resolution of sea-ice concentration estimates, particularly for small features like leads and polynyas. (NA?)
>> Addressing Bias and Error through Rigorous Uncertainty Estimation
  • Carefully consider and address potential sources of bias and error in your data, including those related to measurement methods, platform differences, and data gaps, and estimate the resulting uncertainties using rigorous statistical methods such as Monte Carlo simulations and variance partitioning. (NA?)
>> Bayesian Low-Rank Semi-Parametric Modeling for Large Datasets
  • Consider using a low-rank semiparametric Bayesian spatial mixed-effects linear model to handle large, highly nonstationary spatiotemporal datasets, as it combines the advantages of low-rank approximations for computational efficiency and semiparametric modeling for flexibility in capturing complex dependencies. (Hazra and Huser 2021)
>> Improving Ocean Surface Flux Estimation with Satellite Data
  • Utilize advanced algorithms and multi-satellite data to improve the accuracy of estimating oceanic surface fluxes, especially for latent heat flux, which has historically suffered from low accuracy due to poor surface air specific humidity estimations. (NA?)
>> Improving Hydrologic & Atmospheric Modeling via Novel Calibrations
  • Consider using a simple water balance model (SWBM) with a novel calibration approach involving multiple random parameter sets to generate a hydrological dataset for Europe, as it outperforms several state-of-the-art datasets for soil moisture dynamics and provides a reasonable alternative to sparse measurements or proxy data. (NA?)

  • Consider using a combination of high-resolution regional climate models like COSMO-CLM with coarser global reanalysis data like ERA-Interim, along with techniques such as spectral nudging, to accurately capture fine-scale spatial patterns while maintaining consistency with large-scale atmospheric circulation patterns. (NA?)

> Statistical Approaches for Climate Modeling and Forecasting

>> Statistical methods refine climate predictions via data integration
  • Consider using a spectral approach when modeling General Circulation Models (GCMs) output, as it leads to more interpretable coefficients, improved fits, and reduced computational costs for parameter estimation. (Castruccio and Stein 2013)

  • Utilize a hierarchical statistical model to combine the output from simple ensembles of regional climate models (RCMs) in order to accurately characterize the distribution of the model output and make probabilistic projections of regional climate change. (Sain, Furrer, and Cressie 2011)

  • Account for the spatial dependence of individual climate models and cross-covariance across different climate models when analyzing multiple climate model errors, which can be achieved through the use of a joint statistical model with a nonseparable cross-covariance structure. (Sang, Jun, and Huang 2011)

  • Account for the varying quality and consistency of data sources when aggregating temperature data, as demonstrated by the proposed mathematical framework that weights data points based on their reliability and compatibility within a spatial network. (NA?)

>> Statistical Models for Spatial Scaling Issues in Climate Research
  • Consider using max-stable processes for modeling spatial dependencies in extreme events, as they provide a flexible framework for incorporating spatial dependence structures that are consistent with classical extreme value theory. (Blanchet and Davison 2011)

  • Carefully consider the spatial scaling issues involved in comparing gridded climate model outputs with point-level weather station data, and develop appropriate statistical models to address these challenges. (Mannshardt-Shamseldin et al. 2010)

> Optimal Research Design for Climate Change and Agriculture

>> Improving Predictive Models through Advanced Analytic Approaches
  • Aim to approximate the ideal climate change experiment as closely as possible by analyzing longer-term changes in climate rather than solely focusing on short-term variations in weather, as this allows for a more accurate assessment of adaptation efforts and their impact on economic outcomes. (Burke and Emerick 2016)

  • Consider using multiple gridded weather datasets when developing statistical crop yield models in absence of information about the most reliable gridded weather dataset, as weather dataset choice can have important implications for robustness of conclusions drawn about climatic drivers of yield variability. (NA?)

Enhancing Research Design through Advanced Statistics

> Statistical Techniques Boost Analysis in Biology and Chemistry

>> Statistically Analyzed Domain Swaps Impact Glycosyltransferase Regiospecificity
  • Employ domain-swapping strategies and high-performance liquid chromatography (HPLC)-based kinetic analysis to investigate the basis of differing regiospecificity among closely related glycosyltransferases (GTs), which enables them to identify individual amino acids responsible for observed shifts in specificity and assess their impact on regiospecificity towards various flavonoid substrates. (NA?)
>> Improving Genomic Selection Models with Fixed-Effect Markers
  • Consider incorporating fixed-effect markers from genome-wide association studies (GWAS) into genomic selection (GS) models to potentially improve prediction accuracy for complex traits like capsaicinoid content in Capsicum annuum. (G. W. Kim et al. 2022)

> Statistical Solutions for Biodiversity Conservation and Agriculture

>> Statistical Approaches for Assessing Invasive Species Impact
  • Aim to integrate data from various sources, such as the Global Invasive Species Database (GISD) and DAISIE, by adopting a standardized pathway categorization scheme, such as the one recommended by the Convention on Biological Diversity (CBD), to improve data quality, increase sample size, and enable cross-taxonomic and cross-spatial analyses. (Saul et al. 2016)

  • Carefully consider the impact of introducing non-native species on the functional diversity and network modularity of symbiotic relationships, as evidenced by the finding that introduced trees in New Zealand show a significantly lower functional diversity of fungal hyphal foraging strategies and a simpler network structure compared to native trees. (NA?)

> Statistical Integration of Molecular & Morphological Data for Taxonomy

>> Statistically informed fungal classification integrating molecular & morphology
  • Use multiple lines of evidence, including molecular data, when classifying and identifying fungal taxa, especially those with complex histories and conflicting previous classifications, such as Bactrodesmium. (NA?)

  • Use multiple types of evidence, such as morphological and molecular data, when attempting to distinguish between closely related species or genera, as demonstrated by the finding that the genera Xylodon and Schizopora cannot be reliably distinguished using either type of data alone. (NA?)

> Statistics Applied to Microbiology and Methane Metabolism Studies

>> Statistically Optimizing Inhibitor Use for Microbe-Methane Interactions
  • Consider the impact of specific substrates, particularly quaternary amines like trimethylamine, choline, and glycine betaine, on microbial communities in marine sediments, as your presence can influence the balance between sulfate reduction and methanogenesis, leading to increased methane production. (Gary M. King 1984)

  • Carefully consider the use of selective inhibitors such as sodium molybdate and 2-bromoethanesulfonic acid to differentiate the effects of different microorganisms on substrate metabolism, as demonstrated by the finding that trimethylamine (TMA) was a significant source of methane (35 to 61%) in intertidal sediments, while acetate and methanol were primarily catabolized by sulfate-reducing bacteria. (Gary M. King, Klug, and Lovley 1983)

>> Statistics Optimize Environmental Factors Measurement in Methane Studies
  • Consider the importance of oxygen availability as a limiting factor in studying root-associated methanotrophy in aquatic plants, as evidenced by the strong correlation between maximum potential uptake rates (Vmaxp) and ambient temperature for Calamagrostis canadensis, and the lack of postanoxia aerobic methane consumption in roots incubated under anoxic conditions. (G. M. King 1994)

  • Consider the impact of varying methane and nitrogen concentrations on the inhibition of methane consumption by ammonium and nitrite in pure cultures of two methanotrophs, as the extent of inhibition varies nonlinearly as a function of these factors, leading to different outcomes depending on the specific concentrations used. (Gary M. King and Schnell 1994)

  • Carefully consider the impact of environmental variables, such as soil moisture and nutrient availability, on your experimental outcomes, as demonstrated by the finding that methane uptake was optimal in soils with a water content of 20 to 30% (g water per g dry weight) and was significantly affected by the presence of ammonium and nitrite. (Schnell and King 1994)

  • Carefully control and measure environmental factors such as light exposure, oxygen availability, and sediment type when studying methane oxidation rates in wetland ecosystems, as these variables can greatly impact the accuracy and interpretation of your findings. (Gary M. King, Roslev, and Skovgaard 1990)

> Statistical Techniques Boost Microbial Analysis & Oceanography Studies

>> Improved Phospholipid Analysis for Sediment Microbe Biomass Estimation
  • Consider using a modified phospholipid analysis method for estimating microbial biomass in sediments, which offers benefits including increased sensitivity due to the use of malachite green dye, enhanced simplicity and safety via a persulfate oxidation technique, high accuracy and precision, and applicability to a broad range of sediment types. (Findlay, King, and Watling 1989)
>> Denaturing Gradient Gel Electrophoresis (DGGE) in Marine Bacteria Analysis
  • Consider using denaturing gradient gel electrophoresis (DGGE) as a reliable and reproducible fingerprinting technique for comparing whole bacterial assemblages in marine environments, particularly when studying temporal and spatial dynamics. (Schauer, Massana, and Pedrós-Alió 2000)
>> Integrating Genomic Data with Environment Factors for Marine Bacteria
  • Consider the importance of analyzing genomic data alongside environmental factors to better understand the adaptive strategies of marine bacteria in their natural habitats. (Moran et al. 2004)
>> Statistical Best Practices for Robust Microbiome Data Interpretation
  • Consider collecting biological duplicates when studying microbial communities, as this approach allows for the simultaneous assessment of both technical and environmental sources of variation. (Rohwer and McMahon 2022)

  • Be aware of the potential impact of analytical pipeline choice on the interpretation of microbiome data, particularly when comparing results across studies or sites. (Gary M. King et al. 2012)

> Statistical Analysis for Conservation Efforts in Marine Biology

>> Statistically Assessing Vessel Types
  • Carefully consider the potential impact of different vessel types on non-indigenous marine species (NIMS) introductions, as container ships, which are typically considered low risk and have shorter port stays, may pose less of a threat compared to other vessel types like tugboats, barges, dredgers, and drilling rigs, which tend to stay in port longer and have a higher likelihood of biofouling. (NA?)

Enhancing Data Analysis Techniques Across Disciplines

> Enhance Remote Sensing & Urban Studies via Statistics

>> Statistical Morphometry of Building Footprints Aids Urban Planning
  • Consider using the open-source R package foot to calculate morphology metrics for building footprints and summarize them in various spatial scales and representations, including gridded (or raster) formats, to gain insights about urban development, identify areas prone to natural hazards, and estimate population distribution in the absence of traditional data sources. (Jochem and Tatem 2021)
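
The cited tool is the R package foot; as a rough Python analog (a sketch, not that package's API), the block below computes a few footprint morphology metrics with geopandas and bins them to a coarse grid. The input file name, cell size, and metric choices are illustrative assumptions, and a projected CRS is assumed.

```python
# Rough Python analog of the R package `foot` (a sketch, not its API):
# compute footprint morphology metrics with geopandas and bin them to a grid.
import geopandas as gpd
import numpy as np

bld = gpd.read_file("buildings.geojson")   # hypothetical footprint layer

bld["area_m2"] = bld.geometry.area
bld["perimeter_m"] = bld.geometry.length
# Polsby-Popper compactness: 4*pi*A / P^2 (1 for a circle, smaller if elongated)
bld["compactness"] = 4 * np.pi * bld["area_m2"] / bld["perimeter_m"] ** 2

# Summarize metrics on a coarse grid by binning footprint centroids
cell = 500  # grid cell size in metres
cx = (bld.geometry.centroid.x // cell).astype(int)
cy = (bld.geometry.centroid.y // cell).astype(int)
summary = (bld.assign(cx=cx, cy=cy)
              .groupby(["cx", "cy"])[["area_m2", "compactness"]]
              .agg(["mean", "count"]))
print(summary.head())
```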

> Optimizing Spatial Data Analysis & Machine Learning Models

>> Improving Conflict Studies with Better Spatial Data
  • Consider using a fast spatial multiple imputation procedure to handle known geographically imprecise (KGI) observations in large, geographically disaggregated data sets on armed conflict, as this can improve out-of-sample predictive performance compared to simply excluding such observations. (Salehyan et al. 2012)

  • Carefully consider the resolution and boundaries of your spatial data grids, as well as the appropriateness of your chosen units of analysis, to ensure accurate and meaningful results in conflict studies. (NA?)

>> Optimal Algorithm Selection & Ensemble Methods
  • Carefully consider the choice of machine learning algorithm for your specific problem, taking into account the trade-offs between accuracy, interpretability, and computational efficiency, and potentially employ ensemble methods like random forests to improve performance. (NA?)
>> Improving Spatial Data Quality & Remote Sensing Applications
  • Carefully consider the fitness for use of gridded population data products, taking into account factors such as spatial, thematic, and temporal accuracy, as well as the nature and heterogeneity of the input population data, the use and characteristics of ancillary data involved, and the methodological framework applied to redistribute population counts to grid cells. (Leyk et al. 2019)

> Improving Spatial Analysis & Data Harmonization

>> Datum Selection Impact on Coordinate Consistency in Spatial Analysis
  • Carefully consider the choice of datum when conducting spatial analysis, as different datums can lead to variations in horizontal and vertical coordinates of up to several centimeters, and the most direct transformation path with the latest transformation parameters should be used to obtain the highest and most consistent coordinate quality. (NA?)
>> Semantic Web Technologies for Road Network Data Integration
  • Consider employing Semantic Web Technologies, specifically RDF/Turtle ontologies and semantic rules, to facilitate data harmonization across disparate road network datasets, allowing for improved data integration and coordination among transport agencies in Australia and New Zealand. (NA?)

> Enhanced Mobility Research through Advanced Analytics

>> Urban Perception Prediction via Convolutional Neural Networks
  • Consider using convolutional neural networks (CNNs) trained on large, globally diverse, and crowdsourced datasets to make predictions about urban perception, as demonstrated by the success of the Streetscore-CNN and Ranking SS-CNN models in predicting pairwise comparisons of urban appearance. (Dubey et al. 2016)
>> Improving Urban Mobility Studies with Realistic Datasets
  • Prioritize the use of comprehensive and realistic mobility datasets when evaluating vehicular network performance, as incomplete representations of vehicular mobility can lead to over-optimistic estimates of network connectivity and protocol performance. (Uppoor et al. 2014)
>> Leveraging Telecommunication & Big Mobility Data for Population Insight
  • Consider using telecommunications data in conjunction with census data and satellite images to create high-resolution population estimates in time and space, particularly if they want to track population dynamics in real-time or between censuses. (Douglass et al. 2015)
>> Improving Estimates of Ridesharing Impact with Granular Data
  • Avoid using binary indicators for market entry as a proxy for rideshare activity, as it explains very little variation in actual usage, and instead utilize granular data on rideshare activity to obtain more accurate estimates of its impact on traffic fatalities. (Anderson and Davis 2021)

> Considering Confounders in Cross-Sectional Studies

>> Covid-19 Case Counts: Confounding Factors in Suburban Areas

Optimizing Data Analysis Techniques for Research

> Depth-First Search Enhances Frequent Subgraph Mining

>> Depth-First Search Lexicographic Ordering Reduces False Candidates
  • Consider using depth-first search with DFS lexicographic ordering when mining frequent subgraphs, as it enables efficient discovery of frequent subgraphs while eliminating redundancy and reducing the need for expensive false candidate testing. (Yan and Han, n.d.)

> Bayesian vs Frequentist Statistics & Significance Thresholds

>> Bayesian Alternatives to Classical Tests with R Package
  • Consider implementing Bayesian alternatives to classical statistical tests through the use of the BayesianFirstAid package in R, which provides user-friendly functionality for conducting Bayesian analyses while allowing for comparison between Bayesian and frequentist results. (Roubik 2002)

> Optimal Threshold Estimation for Record Linkage

>> Optimal threshold estimation for record linkage using statistical methods
  • Carefully consider your choice of comparison space and use statistical methods to estimate optimal thresholds for determining matches and non-matches when linking records across multiple data sources. (Fellegi and Sunter 1969)
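
A minimal sketch of the Fellegi-Sunter weighting idea under conditional independence: each compared field contributes a log likelihood ratio of agreement under match versus non-match, and the summed weight is compared against upper and lower thresholds. The m- and u-probabilities below are illustrative assumptions, not estimates.

```python
# Sketch of Fellegi-Sunter agreement weights under conditional independence.
import math

fields = ["surname", "birth_year", "zipcode"]
m = {"surname": 0.95, "birth_year": 0.90, "zipcode": 0.85}  # P(agree | match)
u = {"surname": 0.01, "birth_year": 0.05, "zipcode": 0.10}  # P(agree | non-match)

def match_weight(agreements):
    """Sum log2 likelihood ratios over fields; agreements maps field -> bool."""
    w = 0.0
    for f in fields:
        if agreements[f]:
            w += math.log2(m[f] / u[f])
        else:
            w += math.log2((1 - m[f]) / (1 - u[f]))
    return w

# A candidate pair agreeing on surname and zipcode but not birth year:
print(match_weight({"surname": True, "birth_year": False, "zipcode": True}))
# Pairs above an upper threshold are declared links, pairs below a lower
# threshold non-links, and the interval in between goes to clerical review;
# the thresholds are chosen to meet target false-link and false-non-link rates.
```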

> Causal Inference with Directed Acyclic Graphs

>> Identifying Types of Effect Modification using DAGs
  • Utilize directed acyclic graphs (DAGs) to identify and categorize effect modification in causal relationships, distinguishing between direct effect modification, indirect effect modification, effect modification by proxy, and effect modification by a common cause. (VanderWeele and Robins 2007)

> Optimal MCMC Simulation Length for Better Estimation

>> Longer MCMC Simulations Yield Superior Parameter Estimates
  • Prioritize running longer Markov Chain Monte Carlo (MCMC) simulations instead of relying on numerous shorter ones, as a single long chain provides better estimates than an aggregation of several short chains, even when accounting for burn-in periods. (Margossian and Gelman 2023)
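
A toy illustration of this trade-off using a hand-rolled Metropolis sampler targeting N(0, 1): with the same total budget and the same per-chain burn-in, many short chains discard proportionally more draws and each must re-travel from an overdispersed start.

```python
# One long Metropolis chain vs. many short chains, same total budget.
import numpy as np

rng = np.random.default_rng(0)

def metropolis(n, start=5.0, step=1.0):
    x, out = start, np.empty(n)
    for i in range(n):
        prop = x + step * rng.normal()
        if np.log(rng.uniform()) < 0.5 * (x**2 - prop**2):  # N(0,1) log-ratio
            x = prop
        out[i] = x
    return out

budget, burn = 100_000, 500
long_chain = metropolis(budget)[burn:]
short_draws = np.concatenate(
    [metropolis(budget // 100)[burn:] for _ in range(100)])

print("long chain   mean/sd:", long_chain.mean(), long_chain.std())
print("short chains mean/sd:", short_draws.mean(), short_draws.std())
# With 100 chains of 1,000 draws, half of every chain is discarded as burn-in,
# so the pooled short-chain estimate is noisier than the single long chain.
```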

> Promoting Scientific Integrity to Ensure Valid Results

>> Preventing Misconduct to Uphold Research Reliability
  • Prioritize integrity in all stages of the scientific process, as even seemingly minor instances of misconduct such as redundant publication, failure to disclose conflicts of interest, or fabrication of data can compromise the validity and reliability of findings. (Williams 1997)

> Addressing Challenges in Causal Inference & Model Selection

>> Assumptions in Causal Inference with Observational Data
  • Carefully consider your assumptions about causality when using statistical methods to draw conclusions from observational data. (Jensen et al. 2009)

> Beyond p-values: Reporting Descriptives & Confidence Intervals

>> Reporting Descriptive Statistics and Confidence Intervals
  • Avoid relying solely on p-values as a measure of statistical significance; instead, report descriptive statistics along with confidence intervals to provide a more comprehensive understanding of your results. (Pripp 2015)
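
A minimal sketch of this reporting style with scipy, on illustrative data:

```python
# Report a mean with a 95% confidence interval alongside (not instead of)
# any hypothesis test; `scores` is illustrative data.
import numpy as np
from scipy import stats

scores = np.array([72, 85, 78, 90, 66, 81, 77, 88, 79, 84], dtype=float)

n = scores.size
mean = scores.mean()
sem = stats.sem(scores)                      # standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, df=n - 1, loc=mean, scale=sem)

print(f"n={n}, mean={mean:.1f}, sd={scores.std(ddof=1):.1f}, "
      f"95% CI [{ci_low:.1f}, {ci_high:.1f}]")
```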

> Enhancing Reproducibility through Organized Project Directories

>> Implementing Folder Structures for Reproducible Collaboration
  • Carefully manage your project directories using specific folders (/code, /data, /results) and relative file paths to ensure reproducibility and efficient collaboration. (Trisovic et al. 2022)
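
A minimal sketch of the recommended layout, assuming the script itself lives under /code and resolves /data and /results relative to its own location, so the project runs unchanged on any machine:

```python
# Resolve /data and /results relative to the script's own location.
from pathlib import Path

ROOT = Path(__file__).resolve().parent.parent   # script lives in ROOT/code/
DATA = ROOT / "data"
RESULTS = ROOT / "results"
RESULTS.mkdir(parents=True, exist_ok=True)

raw = DATA / "survey.csv"                 # hypothetical input file
out = RESULTS / "summary.txt"
out.write_text(f"input: {raw.name}\n")    # write outputs only under /results
```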

> Natural Language Processing for Event-Location Relationship Extraction

>> Extracting Event-Location Relationships via NLP
  • Carefully consider using natural language processing techniques to extract event-location relationships from text data when traditional methods may be insufficient or unavailable. (Halterman 2019)

> Causal Inference vs Correlation Consideration in Model Assumptions

>> Addressing Confounders in Distinguishing Correlation from Causality
  • Carefully distinguish between correlation and causality in your analyses, paying close attention to underlying modeling assumptions and potential confounding variables. (Kovářík, Levin, and Wang 2016)

Enhancing Research Design & Analysis Techniques Across Disciplines

> Generative Network Modeling for Animal Social Network Data

>> STRAND Package for Complex Network Models with SBMs & SRMs
  • Consider utilizing the STRAND R package for analyzing animal social network data, as it offers a user-friendly interface for specifying complex network models using simple base R syntax, while incorporating generative network modelling approaches such as Stochastic Block Models (SBMs) and Social Relations Models (SRMs) to address common inferential challenges in network analysis. (Ross, McElreath, and Redhead 2023)

  • Utilize generative network models that actively account for non-independence of data points via correlated random effects at both the node and dyad level during the process of model fitting, rather than relying solely on post-hoc permutation tests or naive regression techniques. (Ross, McElreath, and Redhead 2022)

> Machine Learning Applications in Wildlife Monitoring & Farming

>> Combining Object Detection Models For Posture Classification
  • Consider separating the tasks of object detection and posture classification into distinct models, as demonstrated by the success of combining YOLOv5 for pig detection with EfficientNet for posture classification in precision livestock farming. (L.-W. Chen, Watanabe, and Rudnicky 2023)

> Implementing Complete Linear Models in Sheep Genetics Evaluation

>> Complete Linear Model Boosts Ultrasound Carcass Trait Predictions
  • Consider implementing a complete linear model instead of using pre-adjustment factors for ultrasound scanned carcass traits in large scale sheep genetic evaluations, as it provides significant improvements in regression slopes and is therefore likely to yield more accurate predictions of future progeny performance. (Brito et al. 2017)

> Genetics Studies: Refining Data Interpretation & Mutation Rates

>> Mutation Rate Differences in Mitochondrial vs Nuclear DNA
  • Consider the potential impact of mutagenic compounds generated within the mitochondrion, such as free radicals, on the rate of substitutions in mtDNA, as this could differ from the rate of mutations caused by external factors or affecting nuclear DNA. (Brown 1981)

> Improving Data Collection & Interpretation Through Best Practices

>> Optimizing Animal Behavior Studies with Appropriate Controls
  • Carefully consider the use of appropriate controls in animal behavior studies, such as presenting novel objects alongside dead conspecifics, to distinguish specific responses to relevant stimuli from general responses to novelty. (Iglesias, McElreath, and Patricelli 2012)
>> Optimizing Brain Tissue Extraction Protocols for Safe Preservation
  • Prioritize rapid yet safe brain extraction in the field, adhering to strict biosafety protocols to prevent the spread of potential pathogens while ensuring high-quality tissue preservation for subsequent analysis. (Gräßle et al. 2023)

> Integration of Machine Learning & Multidisciplinary Approaches

>> Machine Learning Impact on Cultural Evolution Processes
  • Consider the impact of intelligent machines on all three Darwinian properties of culture - variation, transmission, and selection - as they are likely to have transformative impacts on cultural evolution. (Brinkmann et al. 2023)

> Evolving Perspectives in Causal Inference & Multidisciplinary Studies

>> Integrating Adaptation Theory for Comprehensive Behavioral Analysis
  • Adopt an integrative approach to studying behavior, incorporating both proximate and ultimate causes using Tinbergen's Four Questions framework, in order to better understand the complex feedback loops between behavior, ecology, and evolution. (Sih et al. 2010)

  • Carefully distinguish between adaptive and non-adaptive sources of individual differences when studying personality traits, using empirical methods to test hypotheses about the origins of these differences. (Wilson 1998)

>> Improved Modeling for Cooperation Dynamics & Geographic Mosaic Theory
  • Consider incorporating variable population sizes, environmental costs of living, and dynamic socio-spatial structures in your models to accurately capture the complexity of real-world scenarios and provide insights into the evolution of cooperation in harsh environments. (Smaldino, Schank, and McElreath 2013)
>> Integrating Theory and Practice in Model Organism Selection
  • Recognize the value of theory-based approaches alongside practical experiments, particularly in situations where traditional lab or fieldwork is not feasible or sufficient, as these methods can provide unique insights and drive scientific progress. (Hallsworth et al. 2023)
>> Bayesian Updating Policies
  • Consider the evolutionary implications of assuming that expected reproductive output is the sole determinant of evolutionary success, as it suggests that organisms using Bayesian updating policies for making decisions based on environmental signals will have a selective advantage. (Okasha 2013)
>> Cue Reliability
  • Consider the impact of cue reliability on developmental trajectories, as repeated exposure to moderately reliable cues can lead to more gradual changes in phenotypic traits and greater variation among individuals compared to exposure to highly reliable cues. (NA?)

> Behavioral Economics: Feedback Loops & Overconfidence

>> Negative Feedback
  • Consider the distinction between immediate and future reproductive effects of risky behavior on asset accumulation, as well as the possibility of repeated decision making over an organism's lifespan, when studying the role of negative feedback in shaping animal personalities. (McElreath et al. 2007)

Enhancing Data Collection and Analysis Techniques

> Improving Hand Gesture Recognition Through Novel Approaches

>> Optimizing Gesture Dataset Collection & Interpretation
  • Consider using a semi-structured elicitation procedure, such as a gameful approach, to obtain more naturalistic results when collecting gesture datasets. (NA?)
>> One-Shot Learning with Curated Datasets
  • Consider collecting and utilizing smaller, carefully curated datasets specifically designed for one-shot learning tasks, rather than relying solely on larger, more traditional datasets. (NA?)

> Improving Human Activity Recognition through Better Datasets

>> Improving Machine Learning Models for Specialized Human Activities
  • Prioritize creating realistic and comprehensive datasets for human activity recognition in specialized fields like construction, as these datasets enable accurate and nuanced analyses of worker activities and safety risks. (Mäkela et al. 2021)
>> Optimizing Sensor Selection and Placement for Activity Recognition
  • Consider collecting and sharing diverse datasets, such as the MyoGym dataset, which contains 6D motion signals and 8-channel electromyogram data from 10 individuals performing 30 different gym exercises, to facilitate the development and evaluation of activity recognition algorithms. (NA?)
>> Smartphone-based HAR: Optimizing Realism and Transition Handling
  • Strive to collect data in real-life conditions, allowing for natural variations in smartphone orientation and placement, to improve the external validity and generalizability of human activity recognition models. (NA?)

Optimizing Data Analysis Techniques for Biological Research

> Enhancing Data Quality & Integration through Best Practices

>> Optimal Computational Platform Selection & Benchmarking
  • Develop benchmarks that reflect fundamental kernels with broad application reach, map to real world problems, and use real data sets with important patterns to ensure that the results are meaningful and applicable to real-world scenarios. (Ueno et al. 2016)

> Enhancing Genomic Data Analysis Through Parallel Computing & Machine Learning

>> Parallel Computation with genlight Class on Large SNP Datasets
  • Consider leveraging parallel computing techniques to efficiently analyze large genome-wide SNP datasets, particularly through the use of the genlight object class provided by the adegenet package, which enables efficient data representation and parallel computation. (Jombart and Ahmed 2011)

> Improving Bioinformatics Tools & Techniques

>> Optimal Use of Non-Redundant Datasets in Phylogenetics
  • Prioritize the use of the non-redundant (NR) version of the SSU Ref dataset for rRNA gene-based classification, phylogenetic analysis, and probe design, as it is significantly smaller and has a more even phylogenetic distribution compared to the full Ref dataset. (NA?)
>> Enhancing Sequencing Data Processing through Advanced Algorithms
  • Consider using the BBMerge tool when working with paired-end shotgun reads, as it outperforms existing merging tools in terms of accuracy and speed, and offers unique features like the ability to merge non-overlapping read pairs using \(k\)-mer frequency information. (NA?)

> Statistical Tools for Analyzing Microbiome and Genomic Data

>> Statistical methods for large-scale genomic & metagenomic analysis
  • Carefully consider the appropriate statistical method when analyzing large-scale genomic and metagenomic data, taking into account factors such as group size and the nature of the data, and apply multiple correction techniques to control for false positives. (NA?)

> Enhancing Phylogenetic Tree Visualization with iTOL Updates

>> Improvements in iTOL Versions for Large Phylogenetic Trees
  • Consider using the updated Interactive Tree Of Life (iTOL) v5 tool for displaying, manipulating, and annotating phylogenetic trees due to its improved tree display engine, expanded annotation options, direct manual annotation capabilities, and enhanced user account system. (Letunic and Bork 2021)

  • Consider using Interactive Tree Of Life (iTOL) v3, an online tool that allows for efficient display, manipulation, and annotation of large phylogenetic trees (up to 100,000 leaves), while providing advanced features such as support for new data types, interactive control over annotation positioning, and an account system for managing trees in user-defined workspaces. (Letunic and Bork 2016)

  • Consider utilizing the updated Interactive Tree Of Life (iTOL) v4 platform for displaying, manipulating, and annotating phylogenetic trees due to its enhanced features, including support for new dataset types, expanded annotation options, and improved user controls for display elements. (NA?)

> Text Mining Strategies for Enhancing Biomedical Information Extraction

>> Text Mining Approaches for Efficient and Reliable Biomedical Literature Analysis
  • Prioritize using full-text articles rather than abstracts alone for text mining tasks, particularly for identifying disease-gene associations, as this approach leads to improved performance metrics such as higher true positive rates and area under the receiver operating characteristic curve (AUC). (Westergaard et al. 2017)
>> Standardization & Annotation Techniques for Biomedical Text Mining
  • Utilize standardized event-based representations and file formats when conducting information extraction tasks in the biomedical domain, as this facilitates system reuse across tasks and promotes collaboration within the community. (“UZH in BioNLP 2013” 2013)

Optimizing Data Analysis and Decision Support Systems

> Enhancing Clinical Text Mining via Advanced NLP Techniques

>> Improving Named Entity Recognition using Machine Learning Approaches
  • Consider employing a multi-component prompt framework when working with large language models for clinical named entity recognition tasks, as it leads to significant improvements in model performance through the incorporation of task-specific information, annotation guidelines, error analysis-based instructions, and annotated samples. (Hu et al. 2023)

  • Prioritize the use of pretrained language models (LM) when extracting social determinants of health (SDOH) from clinical notes, as they offer superior performance compared to traditional rule-based methods and other machine learning algorithms, particularly for complex SDOH constructs such as substance use and homelessness. (Lybarger, Yetisgen, and Uzuner 2023)

> Enhancing Medical Decision Making Through Advanced Technologies

>> Leveraging Machine Learning and Structured Risk Assessment
  • Employ machine learning models like BERT and XGBoost to process vast amounts of natural language data from sources such as Hacker News and Common Vulnerabilities and Exposures (CVE) reports, in order to effectively extract and evaluate the level of identified threats and vulnerabilities that may impact the healthcare system. (Silvestri et al. 2023)
>> Designing Efficient Medical DSS with User Interface Focus
  • Carefully consider the target decision-makers, user interface design, data sources, algorithms, and additional resources when developing medical decision support systems (DSS) to ensure accurate and efficient decision making in various medical scenarios. (NA?)
>> AI-Generated Suggestions for Clinical Decision Support Logic
  • Consider incorporating AI-generated suggestions alongside human-generated ones when optimizing clinical decision support (CDS) alerts, as demonstrated by the finding that out of the top 20 highest-scoring suggestions, 9 were produced by ChatGPT. (S. Liu et al. 2023)

  • Consider incorporating AI-generated suggestions, specifically those produced by large language models like ChatGPT, as a valuable supplement to human-generated suggestions for optimizing clinical decision support (CDS) logic. (NA?)

> Enhancing Data Quality and Interpretation through Various Approaches

>> FAIRifying Health Data Management for Privacy Protection
  • Follow the proposed FAIRification workflow for health data, which includes additional steps for data curation, validation, deidentification, pseudonymization, and indexing compared to the original GO FAIR process, to ensure that health data is managed in a way that is findable, accessible, interoperable, and reusable while protecting patient privacy and complying with legal and ethical requirements. (NA?)
>> Standardizing Core Datasets for Multiple Sclerosis Studies
  • Adopt a standardized core dataset for multiple sclerosis (MS) studies, comprising 44 essential variables across eight categories, to improve data harmonization and facilitate collaboration among MS registries and cohorts. (NA?)
>> Multiple Analyst Review for Robust Results and Moderator Identification
  • Consider adopting a “multi-analyst approach,” where multiple independent analysts evaluate the same data to increase confidence in the robustness of results and identify potentially meaningful moderators of the results when discrepancies arise. (Aczel et al. 2021)

> Ethics and Enhancement in Medical Diagnostics

>> AI Authorship & Text Generation Ethics in Medical Journals
  • Avoid listing large language models (LLMs) like ChatGPT as co-authors or including AI-generated text in submissions to the Korean Journal of Radiology, due to ethical concerns raised by leading academic journals. (Flanagin et al. 2023)

> AI Ethics & Regulation in Healthcare Education & Devices

>> Regulatory Challenges for AI/ML in Medical Applications
  • Advocate for nuanced and adaptive regulatory approaches when implementing large language models (LLMs) in healthcare, taking into consideration factors like data privacy, societal impact, and potential biases, while balancing the need for innovation and patient safety. (NA?)

  • Consider the implications of the predominant use of the 510(k)-clearance pathway for AI/ML-enabled medical devices, which emphasizes substantial equivalence rather than requiring new clinical trials, and recognize the potential limitations of this approach for ensuring safety and effectiveness. (NA?)

Predictive Modeling Techniques for Medical Data

> Advanced Strategies for Handling Complexity in Medical Prediction

>> Time-Dependent ROC Curves for Censored Time-To-Event Analysis
  • Consider using time-dependent ROC curves for evaluating the performance of diagnostic markers in time-to-event analyses, especially when dealing with censored data, as these curves can capture the dynamic nature of diagnostic accuracy over time. (Heagerty, Lumley, and Pepe 2000)
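
A simplified cumulative/dynamic version of the idea at a single horizon t: subjects with an event by t are cases, subjects still event-free beyond t are controls, and the marker's AUC is computed on that dichotomy. This sketch drops the censoring correction that the cited estimators provide, and all data are simulated.

```python
# Cumulative/dynamic ROC at horizon t, ignoring IPCW censoring correction.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
marker = rng.normal(size=300)
event_time = rng.exponential(np.exp(-marker))   # higher marker -> earlier event
censor_time = rng.uniform(0, 4, size=300)
obs_time = np.minimum(event_time, censor_time)
event = event_time <= censor_time

t = 1.0
cases = (obs_time <= t) & event        # event observed by horizon t
controls = obs_time > t                # still event-free past t
keep = cases | controls               # drop subjects censored before t
print("AUC(t=1):", roc_auc_score(cases[keep], marker[keep]))
```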
>> Survival Analysis Models with Fairness Constraints & Machine Learning
  • Consider using a general machine learning framework for survival analysis based on piece-wise exponential models, which enables the incorporation of various complexities such as censoring, truncation, time-varying features, and competing risks through data augmentation and interaction terms, ultimately allowing for the use of a wide range of algorithms optimized for Poisson regression tasks; see the sketch after this list. (Bender et al. 2021)

  • Incorporate fairness constraints in survival analysis models by minimizing the mutual information between predicted time-to-event and sensitive attributes to promote statistical independence and reduce potential disparities in predictions. (I. Y. Chen et al. 2021)
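
A minimal sketch of the piece-wise exponential idea on simulated data: follow-up is split at interval cut points, and events are modeled with a Poisson GLM using log time-at-risk as an offset, so the covariate coefficient estimates a log hazard ratio. Cut points and the data-generating values are illustrative.

```python
# Piece-wise exponential (Poisson) survival model via data augmentation.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 500
x = rng.binomial(1, 0.5, n)                          # binary covariate
time = rng.exponential(1.0 / np.exp(-1 + 0.7 * x))   # true log HR = 0.7
event = (time < 3.0).astype(int)                     # censor at t = 3
time = np.minimum(time, 3.0)

cuts = [0.0, 0.5, 1.0, 2.0, 3.0]
rows = []
for i in range(n):
    for j in range(len(cuts) - 1):
        if time[i] <= cuts[j]:
            break                                    # no exposure beyond event
        exit_ = min(time[i], cuts[j + 1])
        rows.append({"interval": j, "x": x[i],
                     "exposure": exit_ - cuts[j],
                     "event": int(event[i] and time[i] <= cuts[j + 1])})
ped = pd.DataFrame(rows)

fit = smf.glm("event ~ C(interval) + x", data=ped,
              family=sm.families.Poisson(),
              offset=np.log(ped["exposure"])).fit()
print(fit.params["x"])   # log hazard ratio; should be near 0.7
```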

>> Individual Survival Distribution Models for Patient Prognosis
  • Consider using individual survival distribution (ISD) models for making accurate predictions of patient survival probabilities across multiple time points, as these models take into account individual patient characteristics and can improve upon traditional methods such as risk scores, single-time probability models, and population-average survival curves. (“Analysis-Ready Standardized TCGA Data from Broad GDAC Firehose 2016_01_28 Run” 2016)
>> Optimal Clinical Decision Making through Customized Predictive Models
  • Utilize a two-step Bayesian approach to optimize clinical decisions with timing, combining a generative model for medical interventions with a Bayesian joint framework that accounts for uncertainties in clinical observations, ultimately leading to improved patient survival. (Hua et al. 2022)

  • Explicitly define the predictimand, or the specific question about treatment effects that your clinical prediction model aims to answer, as this choice determines the appropriate statistical approach and ensures accurate interpretation of results. (Geloven et al. 2020)

> Optimizing Machine Learning Models for Healthcare Applications

>> Avoiding Spurious Correlations from Dataset Integration
  • Exercise caution when incorporating additional datasets in machine learning models, as doing so can sometimes introduce spurious correlations that negatively affect model performance, particularly in medical imaging contexts where hospital-specific image artifacts can create false associations between disease and hospital. (Berenguer et al. 2022)

> Deep Learning Approaches for Analyzing Electronic Health Records

>> Temporal Dynamics & Recurrent Neural Networks for EHR Analysis
  • Consider incorporating temporal dynamics when analyzing electronic health record (EHR) data, as doing so may improve model performance in predicting incident heart failure diagnosis compared to conventional methods that ignore temporality. (Choi et al. 2016)

  • Consider using recurrent neural networks (RNNs) for predicting multilabel event sequences in electronic health records (EHRs), as they offer several advantages over traditional methods such as discretizing time or using continuous-time Markov chain based models. (Choi et al. 2015)

  • Consider using Long Short-Term Memory (LSTM) recurrent neural networks for analyzing multivariate time series data, as demonstrated by its success in accurately classifying clinical diagnoses from electronic health records despite challenges such as varying length, irregular sampling, and missing data. (Lipton et al. 2015)
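
A minimal PyTorch sketch of an LSTM classifier for variable-length multivariate clinical series, using padding and packing to handle the varying lengths noted above; all shapes, sizes, and names are illustrative.

```python
# LSTM over padded, variable-length multivariate series (multilabel output).
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

class DiagnosisLSTM(nn.Module):
    def __init__(self, n_features=12, hidden=64, n_labels=10):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_labels)

    def forward(self, x, lengths):
        packed = pack_padded_sequence(x, lengths, batch_first=True,
                                      enforce_sorted=False)
        _, (h, _) = self.lstm(packed)   # h: (num_layers, batch, hidden)
        return self.head(h[-1])         # multilabel logits per patient

model = DiagnosisLSTM()
x = torch.zeros(4, 20, 12)              # batch of 4 padded series
lengths = torch.tensor([20, 14, 9, 5])  # true lengths before padding
logits = model(x, lengths)
loss = nn.BCEWithLogitsLoss()(logits, torch.zeros(4, 10))  # dummy targets
```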

>> Unsupervised Feature Learning vs Supervised Algorithms
  • Consider using deep learning algorithms to analyze electronic health records (EHRs) due to their ability to automatically learn complex patterns and relationships from large amounts of data, potentially leading to improved predictive accuracy for various clinical outcomes. (Rajkomar et al. 2018)

> AI/ML-based Socio-Economic Administration with Data Protection

>> AI/ML models for socio-economic governance with data considerations
  • Employ a diverse set of methods, including quantitative-qualitative analysis, synthesis, abstraction, prediction, and experimental methods, to develop and evaluate AI/ML-based prediction mechanisms for improving government socio-economic administration, while carefully considering data quality, access, and protection issues. (Ivashchenko, Ivashchenko, and Vasylets 2023)

Enhancing Study Design & Reporting Transparency

> Optimal Data Collection Strategies Across Various Fields

>> Optimizing Breastfeeding Studies Through Controlled Suckling Events
  • Carefully control and document the timing and frequency of suckling events in studies involving breastfeeding women, as these factors greatly influence prolactin levels and thus potentially affect outcomes of interest. (Tay, Glasier, and McNeilly 1996)
>> Video Recording Studies: Best Practices for Effective Data Collection
  • Carefully plan and monitor video recording studies to ensure effective data collection, considering factors such as introduction, complexity, intrusion, and quantity. (NA?)
>> AI-Assisted CTG Interpretation Dataset Development with Privacy Considerations
  • Aim to collect a large, diverse, and accurately annotated dataset to enable effective development and evaluation of AI-assisted CTG interpretation systems, while carefully addressing issues related to data privacy and informed consent. (NA?)

> Strengthening Causal Inference through Experimental Design Strategies

>> Politics-aware and Large-Scale Randomized Experiments
  • Utilize large-scale randomized experiments whenever feasible to establish causality in the absence of discretionary spending, as demonstrated by the finding that programmatic policies have no discernible effect on voter support for incumbents. (Imai, King, and Rivera 2019)

  • Consider using a stepped wedge experimental design when studying real-world health policy programs, as it allows for randomization within the context of a phased rollout, making it politically feasible and ethical. (Wirtz et al. 2012)

  • Anticipate and proactively address potential political challenges to your experimental designs by building in flexibility and redundancy, such as fallback options and multiple sources of data, to ensure the integrity and validity of your findings. (G. King et al. 2007)

>> Addressing Spillovers & Noncompliance in Treatment Effect Estimation
  • Account for both direct and indirect (spillover) effects of treatment assignment on treatment receipt and outcomes, especially in cases where noncompliance is expected, and utilize appropriate statistical methods to identify and estimate these effects. (Imai, Jiang, and Malani 2020)

  • Consider using randomized controlled trials to evaluate the impact of health insurance programs, particularly when investigating complex phenomena such as spillover effects and the potential trade-offs between cost-sharing and utilization. (Alexander 2020)

> Causal Inference Techniques for Longitudinal Data & Natural Experiments

>> Latent Markov Models for Analyzing Nursing Home Health Outcomes
  • Consider using latent Markov (LM) models for analyzing longitudinal data on nursing homes, as these models can describe individual changes in health status over time and account for the influence of different nursing homes on health outcomes. (Bartolucci, Lupparelli, and Montanari 2009)
>> Addressing Confounders in Natural Experiments Using Control Groups
  • Carefully consider the potential confounding factors when using natural experiments such as field office closures, and ensure that the timing of the event is uncorrelated with changes in the outcome variable by using appropriate control groups and statistical methods. (Deshpande and Li 2019)
>> Natural Experiments Address Endogeneity in Healthcare Competition Studies
  • Leverage natural experiments, like policy changes, to overcome endogeneity issues when estimating the effects of competition on healthcare quality. (Gaynor, Moreno-Serra, and Propper 2013)
>> Quasi-Experimental DiD Approach for Evaluating Complex Interventions
  • Employ a difference-in-differences (DiD) approach to evaluate complex interventions like the Delivering Choice Programme (DCP), which allows for comparison of changes over time between non-random populations by contrasting the treatment group before and after an intervention with a control group from a suitably matched comparator control site that did not receive the intervention. (Round et al. 2013)
>> Difference-in-Differences Analysis for Controlling Unobserved Confounding Factors
  • Consider using a difference-in-differences analysis when evaluating the impact of an intervention or policy change on an outcome of interest, as it can help control for unobserved factors that might otherwise confound the results. (Vermeulen et al. 2015)
>> Difference-in-Differences Analysis for Treatment Effect Estimation
  • Employ a difference-in-differences analysis to estimate changes in spending or quality in the treatment group (ACOs) from the precontract period to the postcontract period that differ from concurrent changes in the control group, while controlling for geographic area and changes in observed sociodemographic and clinical characteristics of beneficiaries. (McWilliams et al. 2015)
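
A minimal sketch of the canonical two-group, two-period DiD estimator as an OLS interaction on simulated data: under the parallel-trends assumption, the coefficient on treated:post recovers the treatment effect. The data-generating values are illustrative.

```python
# Two-period difference-in-differences as an OLS interaction.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 2000
treated = rng.binomial(1, 0.5, n)
post = rng.binomial(1, 0.5, n)
y = (2.0 + 1.0 * treated + 0.5 * post
     + 0.8 * treated * post            # true treatment effect = 0.8
     + rng.normal(0, 1, n))
df = pd.DataFrame({"y": y, "treated": treated, "post": post})

fit = smf.ols("y ~ treated * post", data=df).fit(cov_type="HC1")
print(fit.params["treated:post"], fit.bse["treated:post"])
```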

> Food Environment Studies: Advanced Analytic Techniques

>> Improving Measurement Tools Validity via GIS Solutions
  • Prioritize the use of validated and reliable measurement tools when studying the impact of community food environments on health outcomes, specifically through the application of Geographic Information Systems (GIS)-based solutions such as the Facility List Coder (FLC) tool, which improves the quality and standardization of food environment measurements. (NA?)
>> Fixed Effects Models Minimizing Biases in Child Obesity Studies
  • Employ fixed-effects models with longitudinal data to minimize biases from residential self-selection and unobserved heterogeneity when investigating the relationship between food environments and childhood obesity. (Mölenberg et al. 2021)
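
A minimal sketch of the within (fixed-effects) estimator on simulated child-level panel data: demeaning outcome and exposure within child removes time-invariant traits, including the self-selection term deliberately built into the simulation. Variable names and values are illustrative.

```python
# Within (fixed-effects) estimator via within-child demeaning.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(4)
kids, waves = 300, 4
child = np.repeat(np.arange(kids), waves)
alpha = np.repeat(rng.normal(size=kids), waves)      # unobserved child traits
dens = rng.normal(size=kids * waves) + 0.6 * alpha   # exposure correlates with
                                                     # traits (self-selection)
bmi = 0.15 * dens + alpha + rng.normal(0, 0.5, kids * waves)
df = pd.DataFrame({"child": child, "fastfood_density": dens, "bmi_z": bmi})

demeaned = df.groupby("child")[["bmi_z", "fastfood_density"]].transform(
    lambda s: s - s.mean())
fit = sm.OLS(demeaned["bmi_z"], demeaned[["fastfood_density"]]).fit(
    cov_type="cluster", cov_kwds={"groups": df["child"]})
print(fit.params)   # recovers ~0.15 despite the correlated child effects
```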

> Optimizing Systematic Reviews through Efficient Techniques

>> Maximizing Knowledge Integration via Comprehensive Systematic Review
  • Conduct a thorough systematic review before embarking on new studies, ensuring that your work builds upon existing knowledge rather than duplicating efforts or missing crucial context provided by previous research. (M. Clarke 2004)
>> Leveraging AI and Human Expertise in Systematic Review Processes
  • Consider incorporating high-quality systematic review query examples in your prompts when using ChatGPT for query formulation, as it significantly outperforms prompts without examples in terms of \(F_1\), \(F_3\), and recall. (J. Wang et al. 2023)

> Mitigating Publication Bias & Selective Reporting

>> Preventing Outcome Reporting Bias in Clinical Trials
  • Publish your protocols and statistical analysis plans before conducting clinical trials, and strictly adhere to them during analysis to avoid selective reporting biases caused by discrepancies in analyses between publications and other study documentation. (Dwan et al. 2014)

  • Compare study protocols to final publications to detect and mitigate the impact of outcome reporting bias, which has been found to increase the likelihood of statistically significant results being fully reported by a factor of 2.2 to 4.7. (Dwan et al. 2013)

  • Carefully define and specify primary outcomes in your protocols, avoid selective reporting of statistically significant results, and ensure complete and transparent reporting of all outcomes to prevent outcome reporting bias in randomized controlled trials. (Chan 2004)

>> Addressing Publication Bias in Systematic Reviews
  • Consider the impact of dissemination bias when conducting systematic reviews, as a substantial portion of studies approved by RECs or included in trial registries remain unpublished, leading to selective reporting of results and potentially skewed conclusions. (C. Schmucker et al. 2014)

  • Prioritize the timely submission of your studies to peer-reviewed journals to prevent non-publication, which is often attributed to lack of time or low priority. (Fujian Song, Loke, and Hooper 2014)

  • Employ a multipronged approach to minimize publication bias, including registering trials at inception, searching for unpublished studies through various channels, and encouraging journals to publish high-quality studies irrespective of novelty or exciting results. (Fujian Song, Hooper, and Loke 2013)

  • Employ rigorous and transparent methods to identify, select, extract, and analyze data in order to minimize various forms of bias in systematic reviews, such as publication bias, language bias, and time-lag bias. (Tricco et al. 2008)

>> Bayesian Inference
  • Proactively register your studies prospectively, search thoroughly for unpublished studies, and consider using statistical methods to assess and mitigate publication bias in your analyses. (F. Song et al. 2010)

  • Consider using Bayesian methods in health technology assessment because they enable explicit quantification of uncertainty and incorporation of prior knowledge, which can improve decision-making and increase the efficiency of research. (McCarron et al. 2009)

>> Inclusion of Unpublished Data in Meta-Analysis
  • Carefully consider the potential impact of including unpublished and grey literature study data in meta-analyses, as these sources may introduce biases and affect the pooled effect estimates and overall interpretation of the results. (C. M. Schmucker et al. 2017)
>> Transparency in Clinical Trials through Document Access
  • Ensure transparency and accountability in your work by promptly addressing and disclosing any instances of noncompliance or misconduct discovered by regulatory bodies like the FDA, rather than allowing them to go unreported in the peer-reviewed literature. (Seife 2015)

  • Strive for complete transparency in clinical trials by providing open access to essential trial documents such as protocols and regulatory agency submissions to ensure accurate interpretation and validation of results. (Dwan et al. 2008)

  • Carefully compare the characteristics of clinical trials reported in regulatory submissions to the FDA with those reported in published journal articles, as discrepancies in primary outcomes, statistical significance, and conclusions may indicate selective reporting or publication bias. (n.d.)

>> Promoting Clinical Utility & Preregistration
  • Consider preregistering your clinical trials to improve research credibility, as evidenced by the absence of p-hacking in preregistered trials compared to non-preregistered trials, even when controlling for other design characteristics and sponsor fixed effects. (Decker and Ottaviani 2023)

  • Focus on producing clinically useful research by carefully considering the problem base, context placement, information gain, pragmatism, patient centeredness, value for money, feasibility, and transparency of your studies. (Ioannidis 2016)

> Interdisciplinary Approaches to Strengthen Causal Inference

>> Cross-cultural Validation & Minimally Important Difference Estimation
  • Utilize multiple anchor-based methods with relevant clinical or patient-based indicators, along with distribution-based estimates, to triangulate on a single value or small range of values for the minimally important difference (MID) of a patient-reported outcome (PRO) measure, recognizing that the MID varies by population and context and requires ongoing validation through additional research evidence. (Revicki et al. 2006)
>> Interdisciplinary Economics-Epidemiology Framework for Impact Evaluation
  • Consider adopting a more interdisciplinary approach to impact evaluation, combining the strengths of both economics and epidemiology, such as utilizing theoretical models to derive testable assumptions and ensuring rigorous identification of causal pathways in economics, along with the use of visual logic models and systematic reviews in epidemiology, to enhance the generalizability and transferability of evaluation outcomes. (Bärnighausen et al. 2017)

> Data Integration & Quality Assurance in Heterogeneous Sources

>> Data Quality Prioritization for Research Assessment
  • Prioritize data quality, specifically focusing on relevance, accuracy, credibility, timeliness, accessibility, interpretability, and coherence when collecting and analyzing data for research and innovation assessment, particularly when dealing with heterogeneous data sources. (NA?)

> Promoting Transparent Research Practices through Checklists

>> Observational Studies with STROBE/STARD Checklist Compliance
>> Consensus-Based Comprehensive Transparency Checklist
  • Utilize a consensus-based, comprehensive transparency checklist to ensure openness and accountability throughout the entire research process, including preregistration, methods, results and discussion, and data, code and materials availability. (Aczel et al. 2019)

Integrating Language Models for Enhanced Robot Capabilities

> Optimizing Robotic Control Through Advanced ML Techniques

>> GPU-Accelerated Training Pipelines for Complex Robot Simulations
  • Consider implementing an end-to-end GPU-accelerated training pipeline for robotics simulations to achieve significant speed-ups in training complex environments, as demonstrated by the development of Isaac Gym. (Makoviychuk et al. 2021)

> Leveraging Advanced AI Techniques for Autonomous Vehicles

>> Dynamic Probabilistic Networks & Reinforcement Learning for Autonomous Control
  • Use dynamic probabilistic networks (DPNs) to maintain and update the current belief state of an autonomous vehicle, as DPNs enable efficient real-time temporal inference by exploiting the Markov property and allow for the integration of multiple sensor readings to estimate the value of a measured variable. (Knott et al. 2023)
>> Utilizing MLLMs and LLMs for Autonomous Driving Systems
  • Consider incorporating large language models (LLMs) as the decision-making “brain” in autonomous vehicles, complemented by various tools within the autonomous vehicle ecosystem as the vehicle's sensory “eyes”, to enable informed decision-making and enhance the performance of autonomous vehicles. (Cui et al. 2023)

  • Consider utilizing large language models (LLMs) like ChatGPT for intelligent traffic safety analysis, specifically for tasks such as accident report automation, traffic data augmentation, and multisensory safety analysis, while being mindful of potential risks such as model bias, data privacy, and artificial hallucination. (O. Zheng et al. 2023)

  • Consider leveraging Multimodal Large Language Models (MLLMs) in autonomous driving systems, as they can effectively integrate and analyze diverse data modalities, improving perception, motion planning, and motion control, while also enhancing human-vehicle interaction and personalization. (Chung et al. 2014)

> Combining Language Models for Advanced Robotic Task Execution

>> Combining Different AI Models for Complex Robotic Tasks
  • Consider combining large language models (LLMs) and vision-language models (VLMs) in a hierarchical framework to enable real-time replanning capabilities for long-horizon tasks, allowing robots to adapt to unforeseen obstacles and achieve open-ended goals. (Skreta et al. 2024)

  • Consider using online reinforcement learning to functionally ground large language models in interactive environments, allowing for improved performance in solving goals and better generalization across tasks and objects. (Carta et al. 2023)

  • Consider combining vision-language models (VLMs) and text-to-video models to enable video language planning (VLP), which allows for scalable generation of long-horizon video plans by leveraging the strengths of both types of models: VLMs for proposing abstract text actions and evaluating their potential effectiveness, and text-to-video models for accurately synthesizing possible future world states. (Yilun Du et al. 2023)

  • Consider incorporating grounded models alongside large language models when working with embodied agents, as this combination allows for the integration of semantic knowledge and real-world constraints, leading to improved performance on complex, long-horizon tasks. (W. Huang et al. 2023)

  • Incorporate geometric feasibility planning during the search phase of language-based planning frameworks to resolve geometric dependencies spanning skill sequences, leading to improved performance in long-horizon reasoning tasks. (Lin et al. 2023)

  • Consider combining large language models (LLMs) with sampling-based robot planners to efficiently generate diverse and rich manipulation trajectories, while also incorporating a verification and retry mechanism to ensure successful task completion and add valuable recovery experience for downstream policy distillation. (Todorov, Erez, and Tassa 2012)

>> Integration of Multiple Pre-Trained Models for Robotic Navigation
  • Consider incorporating multiple sensory modalities, such as audio and visual cues, into your studies to enhance the accuracy and robustness of your findings. (C. Huang et al. 2023)

  • Consider combining multiple pre-trained models, specifically a large language model, a vision-language model, and a visual navigation model, to develop a robotic navigation system capable of executing natural language instructions without any user-annotated navigational data. (W. Huang et al. 2022)

  • Consider integrating pretrained visual-language models with 3D reconstructions of the physical world to create spatial map representations (VLMaps) that enable natural language indexing of the map, allowing for more complex language instructions to be followed during robot navigation. (C. Huang et al. 2022)

>> Utilizing LLMs for Code Generation & Complex Robot Actions
  • Consider using customizable prompts for ChatGPT to improve the accuracy and efficiency of generating executable robot actions from natural language instructions, particularly in complex environments requiring multi-step planning. (Wake et al. 2023)

  • Consider utilizing hierarchical code-writing large language models (LLMs) for generating code policies, as they outperform flat code-generation methods in solving complex problems, especially in robotics domains, and can effectively leverage third-party libraries for enhanced reasoning capabilities. (Liang et al. 2022)

  • Consider using programming language-inspired prompt generators to inform large language models of both situated environment state and available robot actions, ensuring output compatibility to robot actions. (Singh et al. 2022)

>> Leveraging Vision-Language Models for Semantic Reasoning and Decision Making
  • Consider co-fine-tuning large vision-language models on both robotic trajectory data and Internet-scale vision-language tasks to improve generalization and enable emergent semantic reasoning in robotic systems. (Brohan et al. 2023)

  • Consider using embodied language models that directly incorporate real-world continuous sensor modalities into language models to improve grounding and enable better real-world decision making. (Driess et al. 2023)

  • Consider using a combination of language-conditioned visual reconstruction and visually-grounded language generation to optimize the trade-off between low-level spatial information and high-level semantic information in order to improve the performance of visual representations for robotics across a variety of tasks. (Karamcheti et al. 2023)

>> Leverage LLMs for Planning, Data Augmentation, Adaptation, and Zero-Shot Learning
  • Leverage large language models (LLMs) to extract commonsense knowledge for planning, specifically for object rearrangement tasks, and integrate this knowledge with a task and motion planner to enable robots to make semantically valid decisions about object placements in complex environments. (Ding et al. 2023)

  • Consider leveraging advanced text-to-image diffusion models for data augmentation in robot learning, allowing for the creation of diverse and semantically meaningful synthetic data without the need for costly real-world data collection or complex simulations. (Yu et al. 2023)

  • Consider leveraging the summarization capabilities of large language models (LLMs) to enable rapid adaptation and accurate personalization of robot behavior based on limited user interaction data. (NA?)

  • Consider leveraging pre-trained web-scale diffusion models, such as DALL-E, as imagination engines for robots, enabling zero-shot learning of complex tasks like object rearrangement without the need for extensive labeled data or additional training. (NA?)

> Language Model Applications for Embodied Agents and Planning

>> Language Model Assistance for Efficient Planning in Embodied Agents
  • Consider combining the strengths of large scale language models (LSLMs) for reasoning and adaptability with those of reinforcement learning (RL) agents for embodiment and control in order to achieve improved performance on complex tasks in ambiguous environments. (Dasgupta et al. 2023)

  • Consider leveraging large language models (LLMs) to hypothesize abstract world models (AWMs) for reinforcement learning (RL) agents, which can then be verified and refined through agent experience, leading to improved sample efficiency and robustness to errors in the LLM. (Nottingham et al. 2023)

  • Consider using a combination of large language models and open-vocabulary object detectors to generate executable action plans for embodied agents in complex environments, as demonstrated by the proposed TaPA framework. (Wu et al. 2023)

  • Consider decomposing complex tasks into simpler, fine-grained skills and utilize a hierarchical approach to planning and learning those skills in order to achieve greater sample efficiency and overall performance in open-world environments. (Yuan et al. 2023)

  • Consider employing large language models (LLMs) for few-shot planning in embodied agents, as demonstrated by the proposed LLM-Planner method, which achieves competitive results using less than 0.5% of paired training data compared to recent baselines trained on the full dataset. (Ahn et al. 2022)

>> Leveraging Pre-Trained Language Models for Goal Selection & Reward Function
  • Consider framing success detection as a visual question answering (VQA) problem, utilizing large pretrained vision-language models (such as Flamingo) to improve generalization performance across varying languages and visual contexts. (Yuqing Du, Konyushkova, et al. 2023)

  • Incorporate human prior knowledge, represented by pretrained language models, to inform the selection of goals for exploration in order to improve the efficiency and effectiveness of reinforcement learning algorithms. (Yuqing Du, Watkins, et al. 2023)

  • Consider using large language models (LLMs) as a proxy reward function in reinforcement learning (RL) tasks, as these models can efficiently learn from few or zero examples and provide accurate reward signals based on user-defined objectives. (Kwon et al. 2023)

>> Assessing Large Language Models
  • Consider connecting large language models (LLMs) to external tools, such as classical planners, to improve their functional competence and avoid altering the LLMs themselves. (B. Liu et al. 2023)

  • Develop and utilize benchmark suites based on established domains to systematically evaluate the planning capabilities of large language models (LLMs) in a rigorous and comparable manner. (Valmeekam et al. 2023)

  • Carefully evaluate the performance of large language models (LLMs) in translating natural language goals to structured planning languages, taking into account the model's ability to handle ambiguity, commonsense reasoning, numerical and physical reasoning, and sensitivity to prompts. (Xie et al. 2023)

  • Utilize a comprehensive and rigorous assessment framework to evaluate the reasoning capabilities of large language models (LLMs) on complex planning tasks, rather than relying solely on simple benchmarks or anecdotal evidence. (Valmeekam et al. 2022)

>> Language Model Techniques for Advanced Task Solving
  • Explore the integration of large language models (LLMs) with external symbolic modules to enhance the performance of LLMs in text-based games involving symbolic tasks, as demonstrated by the proposed LLM agent achieving an average performance of 88% across all tasks. (Fang et al. 2024)

  • Consider using recursive criticism and improvement (RCI) prompting with large language models (LLMs) to improve task grounding, state grounding, and agent grounding in order to develop more efficient and effective computer task-solving agents. (G. Kim, Baldi, and McAleer 2023)

  • Consider combining foundation models, which are pretrained on vast amounts of data, with sequential decision making techniques, such as reinforcement learning, to enable more efficient and generalized learning in complex, interactive environments. (Yang et al. 2023)

  • Consider implementing a “Tree of Thoughts” (ToT) framework for general problem solving with language models, which enables active maintenance of a tree of coherent language sequences that serve as intermediate steps toward problem solving, allowing for self-evaluation, lookahead, and backtracking to make more informed decisions. (Yao et al. 2023)
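
As a rough illustration of the ToT idea, the sketch below runs a breadth-first search over partial solutions. It assumes two hypothetical LLM wrappers: `propose(state, k)` asks the model for k candidate next thoughts, and `score(state)` asks it to rate how promising a partial solution looks (0-1); pruning on that score is what implements self-evaluation and implicitly backtracks away from dead ends.

```python
# Minimal breadth-first "Tree of Thoughts" sketch (after Yao et al. 2023).
# `propose(state, k)` and `score(state)` are hypothetical LLM wrappers.

def tree_of_thoughts(problem, propose, score, depth=3, breadth=5, keep=2):
    frontier = [problem]                        # partial solutions ("thoughts")
    for _ in range(depth):
        candidates = []
        for state in frontier:
            for thought in propose(state, breadth):
                candidates.append(state + "\n" + thought)
        # Self-evaluation: keep only the most promising branches, which
        # prunes (and thereby backtracks away from) unpromising paths.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:keep]
    return frontier[0]
```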

Optimizing Data Mining Techniques for Enhanced Analysis

> Time Series Analysis & Clustering Methodologies

>> Robust Evaluation Strategies for Time Series Representation
  • Conduct multiple resampling experiments to avoid overinterpreting results due to a single train/test split, especially when dealing with small data set sizes and tiny numerical differences. (Bagnall et al. 2016)

  • Be aware of potential inconsistencies and discrepancies in the literature when comparing time series representation methods and similarity measures, and conduct thorough and rigorous evaluations using multiple datasets to ensure validity and reliability of results. (X. Wang et al. 2010)

  • Avoid drawing conclusions based solely on a single train/test split and instead utilize repeated resampling to ensure robustness and reliability of results. (NA?)
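
A minimal illustration of this resampling advice using scikit-learn (the dataset and classifier are stand-ins): accuracy is averaged over 30 random stratified splits and reported with its spread, rather than read off a single split.

```python
# Repeated resampling instead of a single train/test split: average
# accuracy over many random stratified splits and report its spread,
# so tiny numerical differences are not overinterpreted.
import numpy as np
from sklearn.datasets import load_iris          # stand-in dataset
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
splitter = StratifiedShuffleSplit(n_splits=30, test_size=0.3, random_state=0)

scores = []
for train_idx, test_idx in splitter.split(X, y):
    clf = KNeighborsClassifier().fit(X[train_idx], y[train_idx])
    scores.append(clf.score(X[test_idx], y[test_idx]))

print(f"accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f} over 30 resamples")
```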

> Change Point Detection Methodologies in Time Series Data

>> Optimal Selection of Parameters & Approaches in Change Point Detection
  • Carefully consider the choice of cost function, search method, and constraint on the number of changes when selecting or developing an offline change point detection algorithm for multivariate time series data (a code sketch follows this list). (Truong, Oudre, and Vayatis 2020)

  • Carefully consider the trade-off between speed and accuracy when selecting a changepoint search method: a comparison of three algorithms (binary segmentation, segment neighborhood, and PELT) shows that binary segmentation is fastest but may sacrifice accuracy relative to the other two. (Killick and Eckley 2014)

  • Consider using a Bayesian approach to online changepoint detection, specifically one that employs a message-passing algorithm to estimate the probability distribution of the length of the current “run,” or time since the last changepoint, under the assumption of independence between model parameters before and after the changepoint. (Adams and MacKay 2007)
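
The sketch promised above uses the `ruptures` package that accompanies Truong, Oudre, and Vayatis (2020); the cost model, search method, and constraint (a penalty, or a fixed number of breakpoints) are all explicit arguments.

```python
# Offline changepoint detection with `ruptures`: the cost function,
# search method, and constraint on the number of changes are explicit.
import numpy as np
import ruptures as rpt

rng = np.random.default_rng(0)
signal = np.concatenate([rng.normal(0, 1, 200),    # regime 1
                         rng.normal(4, 1, 200)])   # regime 2

# PELT: penalized search when the number of changes is unknown.
pelt_bkps = rpt.Pelt(model="l2").fit(signal).predict(pen=10)

# Binary segmentation: faster, here constrained to exactly one change.
binseg_bkps = rpt.Binseg(model="l2").fit(signal).predict(n_bkps=1)

print(pelt_bkps, binseg_bkps)  # indices where segments end, e.g. [200, 400]
```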

> Time Series Analysis & Forecasting Methodologies

>> \(\ell_{1}\) Trend Filtering for Piecewise Linear Estimation
  • Consider using the proposed \(\ell_1\) trend filtering method instead of traditional Hodrick-Prescott (H-P) filtering for estimating trends in time series data, particularly when the underlying trend is piecewise linear. This method replaces the \(\ell_2\)-norm penalty with an \(\ell_1\)-norm penalty, producing trend estimates that are piecewise linear and making abrupt changes or events in the underlying dynamics of the series easier to interpret. (S.-J. Kim et al. 2009)
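
In symbols, \(\ell_1\) trend filtering solves \(\min_x \tfrac{1}{2}\|y-x\|_2^2 + \lambda\|Dx\|_1\), where \(D\) is the second-difference operator; the H-P filter is the same problem with \(\|Dx\|_2^2\) in place of the \(\ell_1\) term. The paper describes a specialized interior-point solver; the sketch below instead uses the general-purpose cvxpy package on synthetic data.

```python
# l1 trend filtering:  minimize 0.5 * ||y - x||_2^2 + lam * ||D x||_1,
# where D takes second differences, so the estimate x is piecewise linear.
import numpy as np
import cvxpy as cp

y = np.abs(np.linspace(-1, 1, 201)) \
    + 0.05 * np.random.default_rng(0).normal(size=201)   # V-shaped trend + noise
n, lam = len(y), 10.0

D = np.diff(np.eye(n), n=2, axis=0)   # second-difference matrix, shape (n-2, n)

x = cp.Variable(n)
cp.Problem(cp.Minimize(0.5 * cp.sum_squares(y - x) + lam * cp.norm1(D @ x))).solve()
trend = x.value   # piecewise-linear estimate; kinks mark abrupt changes

```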
>> Interpretable Models & Residual Analysis for Accurate Forecasting
  • Carefully distinguish between the underlying mean and the residuals in your time series analysis, as accurately estimating the former and understanding the latter's dependence structure are crucial steps towards building reliable confidence intervals and effectively modeling the data. (Dahlhaus, Richter, and Wu 2017)

  • Employ a modular regression model with interpretable parameters that can be intuitively adjusted by analysts with domain knowledge, combined with performance analyses to compare and evaluate forecasting procedures, and tools to automatically flag forecasts for manual review and adjustment. (Taylor and Letham 2017)
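
Taylor and Letham's procedure is implemented in the open-source Prophet package; the hedged sketch below (the input file name and tuning value are placeholders) shows the analyst-facing knobs and the forecast columns one would review and flag.

```python
# Modular, analyst-adjustable forecasting in the spirit of Taylor and
# Letham (2017), assuming the `prophet` package and a dataframe with a
# date column `ds` and a value column `y`.
import pandas as pd
from prophet import Prophet

df = pd.read_csv("daily_series.csv")         # hypothetical input file
m = Prophet(changepoint_prior_scale=0.1)     # interpretable trend-flexibility knob
m.fit(df)

future = m.make_future_dataframe(periods=90)  # forecast 90 days ahead
forecast = m.predict(future)
# Point forecast plus uncertainty bounds an analyst can review and flag.
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```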

>> Transformations and Approximation Techniques for Accurate Forecasting
  • Carefully assess whether the log transformation truly stabilizes the variance of your time series before deciding to use it for forecasting, as failing to do so could result in reduced forecast precision (an informal diagnostic is sketched after this list). (Lütkepohl and Xu 2010)

  • Consider using a two-step approach to approximate multi-step ahead density forecasts for non-Gaussian data, where the first step involves modeling the dynamics of the conditional mean and variance using a Gaussian model, and the second step uses these estimates to derive a parametric approximation to the true density function. (Lau and McSharry 2010)
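
The informal diagnostic promised above, relevant to the log-transformation advice: compare how stable the rolling standard deviation is before and after transforming (synthetic data; a smaller spread of the rolling standard deviation suggests the transform is in fact stabilizing variance). This is a heuristic check, not the formal assessment the paper develops.

```python
# Informal check of whether a log transform stabilizes variance: a series
# whose volatility grows with its level should show a much steadier
# rolling standard deviation after the transform.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
level = np.exp(np.linspace(0, 3, 500))
y = pd.Series(level * (1 + 0.1 * rng.normal(size=500)))  # noise scales with level

for name, series in [("raw", y), ("log", np.log(y))]:
    rolling_sd = series.rolling(50).std().dropna()
    print(f"{name}: spread of rolling sd = {rolling_sd.std():.4f}")
```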

> Combining Evidence & Boosting Features in Sequence Analysis

>> Combinatorial Approaches for Multiple Motif Detection
  • Consider using the product of independent p-values as a test statistic for combining evidence from multiple sources, particularly when dealing with sequence homology searches, as it provides an effective and efficient means of integrating information from various motifs and improves the overall sensitivity and specificity of the analysis. (Bayat 2002)
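
The product of independent p-values is the basis of Fisher's combining method: under the null, \(-2\sum_i \ln p_i\) follows a chi-square distribution with \(2k\) degrees of freedom for \(k\) p-values. SciPy exposes this directly; the values below are illustrative.

```python
# Combine independent p-values via their product (Fisher's method):
# under the null, -2 * sum(log p_i) ~ chi-square with 2k degrees of freedom.
from scipy.stats import combine_pvalues

p_values = [0.08, 0.12, 0.05]  # e.g., one per motif match in a homology search
statistic, combined_p = combine_pvalues(p_values, method="fisher")
print(statistic, combined_p)   # a single combined test of the joint evidence
```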

> Efficient Pattern Recognition Methodologies for Large Datasets

>> Optimal Frequency-based Approaches for Association Rule Discovery
  • Consider analyzing emerging patterns (EPs) in your data, which are itemsets with significant changes in support across two datasets, rather than just focusing on frequent itemsets or association rules. (Dong and Li 1999)
>> Depth-First Search & Bitmap Representation for Long Sequences
  • Consider using a depth-first search strategy combined with a bitmap representation of the database for efficiently mining long sequential patterns, particularly when dealing with large datasets. (Ayres et al. 2002)
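
A toy illustration of the bitmap trick behind this approach (the actual algorithm additionally uses transformed bitmaps so that sequence order is respected; this sketch only counts co-occurrence): each item is stored as a bitmap over transactions, so the support of a pattern is the popcount of a bitwise AND.

```python
# Simplified bitmap support counting (in the spirit of Ayres et al. 2002).
bitmaps = {                 # bit t set => item appears in transaction t
    "a": 0b101101,
    "b": 0b100111,
    "c": 0b001100,
}

def support(items, n_transactions=6):
    combined = ~0                       # start with all bits set
    for item in items:
        combined &= bitmaps[item]       # intersect transaction sets
    mask = (1 << n_transactions) - 1
    return bin(combined & mask).count("1")

print(support(["a", "b"]))  # transactions containing both a and b -> 3
```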

> Categorical Clustering Methodologies for Advanced Data Analysis

>> Clustering Categorical Data via Extended LDA & K-Modes
  • Consider using the Rlda package for mixed-membership clustering analysis of categorical data, which extends the traditional Latent Dirichlet Allocation (LDA) model to handle Multinomial, Bernoulli, and Binomial data types, and allows for the selection of the optimal number of clusters based on a truncated stick-breaking prior approach. (Albuquerque, Valle, and Li 2019)

Wireless Network Optimization & Machine Learning Applications

> Location Estimation Techniques in Wireless Networks

>> Machine Learning Methods for Accurate Device Localization
  • Consider using hierarchical Bayesian graphical models for wireless location estimation, as they can accurately estimate locations without any location information in the training data, resulting in a truly adaptive, zero-profiling technique. (Madigan et al., n.d.)
>> Benchmarking Reliability of MAC Address Association Frameworks
  • Develop and employ benchmarks to measure the reliability of MAC address association frameworks across different datasets, taking into account the impact of device heterogeneity on randomization complexity. (Mishra, Viana, and Achir 2023)

> Optimal Filter Design, SDR Platforms, Adaptive Communications

>> Optimal Balance Preservation in FIR Filters for ECG Signals
  • Prioritize finding the optimal balance between preserving signal quality and reducing hardware requirements when designing FIR filters for specific applications, such as filtering ECG signals. (Meidani and Mashoufi 2016)

> Optimal Resource Management & Model Selection

>> Edge AI Systems Design & Performance Tradeoffs
  • Carefully consider the tradeoff between computational resource consumption and perceived image quality when developing AIGC services for wireless edge networks, as excessive inference steps can incur unnecessary resource consumption without necessarily improving perceived image quality. (H. Du et al. 2023)

  • Leverage the advanced language understanding, planning, and code generation abilities of large language models like GPT to build autonomous edge AI systems that can accurately comprehend user demands, efficiently execute AI models with minimal latency, and automatically generate high-performing AI models in a privacy-preserving manner. (Shen et al. 2023)

Improving Software Development Using Language Models

> Improving Failure Diagnosis in Complex Systems

>> Microservice Failure Root Cause Localization using Temporal Algorithms
  • Consider combining a path condition time series (PCTS) algorithm with a temporal cause oriented random walk (TCORW) approach to improve the accuracy of failure root cause localization in microservices, specifically by incorporating propagation delays and prioritizing causal relationships based on domain knowledge. (Kao 2017)
>> Trade-Offs Between Fine-Tuned vs Zero-Shot Learning
  • Carefully consider the trade-offs between fine-tuning large language models versus using them in zero-shot or few-shot settings, as fine-tuning can significantly improve their performance on specific tasks, but comes at the cost of environmental impact and reduced generalizability. (Ahmed et al. 2023)

> Automating Bug Detection & Triaging with Machine Learning

>> Fine-Tuning Large Transformers for Automated Debugging
  • Leverage large, pretrained transformers and fine-tune them on specific tasks using relevant data sources, such as synthetic bugs generated from reversed commit data and rich debugging information obtained from executing tests, to improve the accuracy and efficiency of automated debugging systems. (Drain et al. 2021)

> Enhancing Fuzz Testing with Machine Learning Techniques

>> Leverage Large Language Models for Protocol Fuzzing
  • Consider leveraging large language models (LLMs) to guide your protocol fuzzing efforts, specifically through extracting machine-readable grammars for protocol messages, increasing the diversity of messages in recorded message sequences, and breaking out of coverage plateaus by having the LLM generate messages to reach new states. (Deng et al. 2023)
>> Two-Phase Method for Generating Vulnerable Commit Datasets
  • Consider using a two-phase method for generating vulnerability introducing commit datasets, consisting of an initial phase that identifies candidate commits using the SZZ algorithm followed by a second phase that filters and ranks those candidates based on a relevance score metric. (Aladics, Hegedűs, and Ferenc 2021)

> Enhancing Large Language Model Performance in Automated Program Repair

>> Evaluating Functional Correctness of Synthesized Code
  • Employ rigorous and diverse testing methods, such as combining LLM-based and traditional mutation-based test input generation, to thoroughly evaluate the functional correctness of LLM-synthesized code and avoid overestimation caused by insufficient testing. (J. Liu et al. 2023)
>> Monitoring Behavior Variations for Reproducible Integration
  • Continuously monitor the behavior of large language models (LLMs) over time, as significant variations in performance and behavior can occur, potentially undermining the reproducibility and stability of workflows that integrate LLMs. (L. Chen, Zaharia, and Zou 2023)
>> Leveraging Multilingual Models & Domain Knowledge Libraries
  • Consider incorporating domain-specific knowledge libraries into large language models to improve their ability to handle complex and novel programming problems. (T. Huang et al. 2024)

  • Consider developing and utilizing multilingual models for code generation tasks, as evidenced by the superior performance of CodeGeeX over monolingual models on the HumanEval-X benchmark. (Q. Zheng et al. 2023)

>> Innovative Approaches to Code Generation
  • Incorporate a sketch-based approach into your code generation models, where a code sketch is extracted from a similar code snippet and then edited based on the input description rather than simply copied from the similar code. This allows the model to extract relevant content while avoiding irrelevant parts, improving performance in code generation tasks. (Li et al. 2023)

  • Consider utilizing a flexible encoder-decoder architecture for code large language models, which can be operated in encoder-only, decoder-only, or encoder-decoder mode, and incorporating a diverse mixture of pretraining objectives on unimodal and bimodal data to effectively transfer learned representations to various downstream tasks. (Y. Wang et al. 2023)

  • Consider incorporating planning algorithms into the Transformer generation process for code generation tasks, allowing for more informed decisions and improved program quality through lookahead search and testing on public test cases. (Zhang et al. 2023)

  • Carefully consider and address potential sources of experimental bias, such as dataset duplication and overlapping inputs, to ensure accurate and reliable evaluations of automated program generation models. (NA?)

>> Addressing Limitations & Optimizing Techniques for LLM Debugging
  • Provide clear and specific information about the intended functionality of deep learning programs to improve the effectiveness of automated debugging tools like ChatGPT. (Cao et al. 2023)

  • Consider leveraging the interactive capabilities of large language models like ChatGPT to improve performance in automated program repair tasks, as demonstrated by the significant increase in bug-fixing success rates when providing additional context and feedback to the model. (Sobania et al. 2023)

  • Carefully consider the potential for data leakage when evaluating large language models (LLMs) on programming tasks, as the models may have already learned the solution references provided in the benchmarks during their pre-training phase. (Tian et al. 2023)

>> Fine-Tuning Pre-Trained Code Language Models
  • Consider fine-tuning pre-trained code language models using APR-specific data to improve their performance in automated program repair tasks. (Jiang et al. 2023)

  • Consider using a conversational APR approach when working with LLMs for automated program repair, as it enables the model to learn from past mistakes and improve the quality of generated patches by incorporating validation feedback in a conversational manner. (Xia and Zhang 2023)

> Optimizing AI-Generated Code Explanation & Education

>> Error Detection & Correction using Large Language Models
  • Carefully evaluate the performance of large language models like Codex in generating improved programming error messages (PEMs) through rigorous testing and comparison with traditional methods, taking into account factors such as error type, program complexity, and temperature parameter tuning. (Leinonen, Hellas, et al. 2023)
>> Considerations for Design, Ethics, and Research Methodology
  • Carefully consider the potential benefits and drawbacks of incorporating AI code-generators in educational settings, as they may enhance learning outcomes for certain populations but also pose risks related to dependency, comprehension, and academic integrity. (Kazemitabaar et al. 2023)

  • Carefully consider the implications of AI-driven code generation tools on your experimental designs, particularly in terms of controlling for potential confounding variables introduced by the availability and use of these tools. (Becker et al. 2023)

  • Carefully consider the wording and structure of your problem prompts when working with code generation models like Copilot, as minor adjustments can significantly affect the accuracy of the generated code. (Denny, Kumar, and Giacaman 2022)

>> Evaluating Effectiveness and Types of AI-Generated Code Explanations
  • Consider using large language models (LLMs) to generate code explanations for educational purposes, as they have been found to match human-created explanations in ideal length while being rated higher for accuracy and ease of understanding. (Leinonen, Denny, et al. 2023)

  • Consider leveraging artificial intelligence (AI) as a tool to efficiently generate varied examples and explanations, develop low-stakes tests, and facilitate distributed practice, while remaining vigilant against potential pitfalls such as AI hallucination and ensuring alignment with learning objectives and student abilities. (Mollick and Mollick 2023)

  • Carefully consider the type of code explanation generated by LLMs, as line-by-line explanations tend to be viewed more frequently but rated as less useful for learning compared to summary and concept explanations. (MacNeil et al. 2022)

> Automated Machine Learning with Large Language Models

>> Interpretable Feature Engineering using LLMs for AutoML
  • Consider incorporating large language models (LLMs) in your automated machine learning (AutoML) pipelines to enhance feature engineering and improve model performance, while ensuring interpretability and transparency through the generation of comments explaining the utility of each generated feature. (Hollmann, Müller, and Hutter 2023)

> Ontology Mapping, Visualization, and Management Techniques

>> Ontology Creation for Personalized Learning Pathways
  • Carefully consider the choice of ontology development methodology based on factors such as the size and complexity of the domain, availability of resources, and desired level of formalization, while also taking advantage of tools and techniques such as automated reasoning and ontology design patterns to ensure high-quality and interoperable ontologies. (Keet 2020)

  • Utilize Bayesian networks to map ontologies in e-learning contexts, enabling the creation of lightweight ontologies that can effectively represent the knowledge domain of a course and facilitate personalized learning paths. (NA?)

> Enhancing Software Engineering Processes through Text Analysis

>> Comparative Studies on Requirements Similarity & Model Transformation
  • Carefully evaluate and compare multiple language models for measuring requirements similarity, as the choice of model significantly impacts the correlation between requirements similarity and software similarity. (NA?)

> Machine Learning System Design & Best Practices

>> Machine Learning Systems Optimization Strategies
  • Employ a mixed-methods approach combining literature reviews, tool evaluations, and expert interviews to understand the principles, components, roles, and architecture needed for successful implementation of Machine Learning Operations (MLOps). (Kreuzberger, Kühl, and Hirschl 2022)

  • Consider utilizing weak supervision, specifically through the use of noisy, programmatically-generated training data, to address the common issue of limited labeled training data in machine learning projects. (Dehghani et al. 2017)

  • Prioritize careful consideration of the entire machine learning pipeline, including data collection, feature engineering, model selection, and model interpretation, when developing and managing machine learning models. (T. Chen et al. 2015)

  • Carefully consider and monitor the potential for technical debt in machine learning systems, particularly in terms of data dependencies, feedback loops, and system-level interactions, as these can lead to significant maintenance costs and unintended consequences over time. (Ananthanarayanan et al. 2013)

>> GPT Survey Validation for Market Research
  • Carefully consider the phrasing of prompts when using GPT for market research, as minor variations in wording can significantly affect the magnitude of responses, and validation in specific contexts is recommended before relying solely on GPT surveys for estimates of consumer preferences. (Aher, Arriaga, and Kalai 2022)
>> Weakly Supervised Relation Extraction via Osprey
  • Consider using a weakly-supervised approach, specifically the Osprey system, to significantly reduce the time and cost associated with developing and maintaining high-precision models for relation-extraction tasks, particularly in situations with extreme class imbalance and high labeling costs. (Kammoun et al. 2022)

> Enhancing Data Analysis & Service Interoperability with Metadata Standards

>> PROV Model for Tracking Provenance Metadata
  • Consider using the PROV model for tracking provenance metadata (i.e., information about entities, activities, agents, and their relationships) to ensure transparency, reproducibility, and trustworthiness in data analysis workflows. (Moreau and Groth 2013)
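
A minimal sketch using the Python `prov` package, an implementation of the W3C PROV data model (the namespace and names here are illustrative): record that a cleaning activity, run by an analyst, used a raw dataset and generated a cleaned one.

```python
# Minimal PROV provenance record: who did what to which data.
from prov.model import ProvDocument

doc = ProvDocument()
doc.add_namespace("ex", "http://example.org/")

raw = doc.entity("ex:raw-data")           # input dataset
clean = doc.entity("ex:clean-data")       # derived dataset
run = doc.activity("ex:cleaning-run")     # the processing step
analyst = doc.agent("ex:analyst")         # who is responsible

doc.used(run, raw)
doc.wasGeneratedBy(clean, run)
doc.wasAssociatedWith(run, analyst)

print(doc.get_provn())  # human-readable PROV-N serialization
```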

References

n.d. https://doi.org/10.1371/journal.pmed.0050201.t001.
Abdellaoui, Mohammed, Aurélien Baillon, Laetitia Placido, and Peter P Wakker. 2011. “The Rich Domain of Uncertainty: Source Functions and Their Experimental Implementation.” American Economic Review 101 (April). https://doi.org/10.1257/aer.101.2.695.
Aczel, Balazs, Barnabas Szaszi, Gustav Nilsonne, Olmo R van den Akker, Casper J Albers, Marcel ALM van Assen, Jojanneke A Bastiaansen, et al. 2021. “Consensus-Based Guidance for Conducting and Reporting Multi-Analyst Studies.” eLife 10 (November). https://doi.org/10.7554/elife.72185.
Aczel, Balazs, Barnabas Szaszi, Alexandra Sarafoglou, Zoltan Kekecs, Šimon Kucharský, Daniel Benjamin, Christopher D. Chambers, et al. 2019. “A Consensus-Based Transparency Checklist.” Nature Human Behaviour 4 (December). https://doi.org/10.1038/s41562-019-0772-6.
Adams, Ryan Prescott, and David J. C. MacKay. 2007. “Bayesian Online Changepoint Detection.” arXiv. https://doi.org/10.48550/ARXIV.0710.3742.
Agresti, Alan, and Brett Presnell. 2002. “Misvotes, Undervotes and Overvotes: The 2000 Presidential Election in Florida.” Statistical Science 17 (November). https://doi.org/10.1214/ss/1049993202.
Aher, Gati, Rosa I. Arriaga, and Adam Tauman Kalai. 2022. “Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies.” arXiv. https://doi.org/10.48550/ARXIV.2208.10264.
Ahmed, Toufique, Supriyo Ghosh, Chetan Bansal, Thomas Zimmermann, Xuchao Zhang, and Saravan Rajmohan. 2023. “Recommending Root-Cause and Mitigation Steps for Cloud Incidents Using Large Language Models.” arXiv. https://doi.org/10.48550/ARXIV.2301.03797.
Ahn, Michael, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, et al. 2022. “Do as i Can, Not as i Say: Grounding Language in Robotic Affordances.” arXiv. https://doi.org/10.48550/ARXIV.2204.01691.
Aladics, Hegedűs, and Ferenc. 2021. “A Vulnerability Introducing Commit Dataset for Java: An Improved SZZ Based Approach,” December. https://doi.org/10.5281/ZENODO.5785239.
Albuquerque, Pedro H. M., Denis Ribeiro do Valle, and Daijiang Li. 2019. “Bayesian LDA for Mixed-Membership Clustering Analysis: The Rlda Package.” Knowledge-Based Systems 163 (January). https://doi.org/10.1016/j.knosys.2018.10.024.
Alexander, Diane. 2020. “How Do Doctors Respond to Incentives? Unintended Consequences of Paying Doctors to Reduce Costs.” Journal of Political Economy 128 (November). https://doi.org/10.1086/710334.
“Analysis-Ready Standardized TCGA Data from Broad GDAC Firehose 2016_01_28 Run.” 2016. https://doi.org/10.7908/C11G0KM9.
Ananthanarayanan, Rajagopal, Venkatesh Basker, Sumit Das, Ashish Gupta, Haifeng Jiang, Tianhao Qiu, Alexey Reznichenko, Deomid Ryabkov, Manpreet Singh, and Shivakumar Venkataraman. 2013. “Photon.” Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, June. https://doi.org/10.1145/2463676.2465272.
Anderson, Michael, and Lucas Davis. 2021. “Uber and Alcohol-Related Traffic Fatalities,” July. https://doi.org/10.3386/w29071.
Ansolabehere, Stephen, and Eitan Hersh. 2012. “Validation: What Big Data Reveal about Survey Misreporting and the Real Electorate.” Political Analysis 20. https://doi.org/10.1093/pan/mps023.
Appelt, Kirstin C., Kerry F. Milch, Michel J. J. Handgraaf, and Elke U. Weber. 2011. “The Decision Making Individual Differences Inventory and Guidelines for the Study of Individual Differences in Judgment and Decision-Making Research.” Judgment and Decision Making 6 (April). https://doi.org/10.1017/s1930297500001455.
Arai, Yoichi, Taisuke Otsu, and Myung Hwan Seo. 2021. “Regression Discontinuity Design with Potentially Many Covariates.” arXiv. https://doi.org/10.48550/ARXIV.2109.08351.
Arceneaux, Kevin, Alan S. Gerber, and Donald P. Green. 2010. “A Cautionary Note on the Use of Matching to Estimate Causal Effects: An Empirical Example Comparing Matching Estimates to an Experimental Benchmark.” Sociological Methods & Research 39 (August). https://doi.org/10.1177/0049124110378098.
Aronow, Peter M., Jonathon Baron, and Lauren Pinson. 2019. “A Note on Dropping Experimental Subjects Who Fail a Manipulation Check.” Political Analysis 27 (May). https://doi.org/10.1017/pan.2019.5.
Ayres, Jay, Jason Flannick, Johannes Gehrke, and Tomi Yiu. 2002. “Sequential PAttern Mining Using a Bitmap Representation.” Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July. https://doi.org/10.1145/775047.775109.
Bafumi, Joseph, Andrew Gelman, David K. Park, and Noah Kaplan. 2005. “Practical Issues in Implementing and Understanding Bayesian Ideal Point Estimation.” Political Analysis 13. https://doi.org/10.1093/pan/mpi010.
Bagnall, Anthony, Aaron Bostrom, James Large, and Jason Lines. 2016. “The Great Time Series Classification Bake Off: An Experimental Evaluation of Recently Proposed Algorithms. Extended Version.” arXiv. https://doi.org/10.48550/ARXIV.1602.01711.
Bansak, Kirk, Jens Hainmueller, and Teppei Yamamoto. 2017. “Beyond the Breaking Point? Survey Satisficing in Conjoint Experiments.” SSRN Electronic Journal. https://doi.org/10.2139/ssrn.2959146.
Barabas, Jason, and Jennifer Jerit. 2010. “Are Survey Experiments Externally Valid?” American Political Science Review 104 (May). https://doi.org/10.1017/s0003055410000092.
Bärnighausen, Till, John-Arne Røttingen, Peter Rockers, Ian Shemilt, and Peter Tugwell. 2017. “Quasi-Experimental Study Designs Series—Paper 1: Introduction: Two Historical Lineages.” Journal of Clinical Epidemiology 89 (September). https://doi.org/10.1016/j.jclinepi.2017.02.020.
Bartolucci, Francesco, Monia Lupparelli, and Giorgio E. Montanari. 2009. “Latent Markov Model for Longitudinal Binary Data: An Application to the Performance Evaluation of Nursing Homes.” The Annals of Applied Statistics 3 (June). https://doi.org/10.1214/08-aoas230.
Bayat, A. 2002. “Science, Medicine, and the Future: Bioinformatics.” BMJ 324 (April). https://doi.org/10.1136/bmj.324.7344.1018.
Becker, Brett A., Paul Denny, James Finnie-Ansley, Andrew Luxton-Reilly, James Prather, and Eddie Antonio Santos. 2023. “Programming Is Hard - or at Least It Used to Be.” Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 1, March. https://doi.org/10.1145/3545945.3569759.
Bender, Andreas, David Rügamer, Fabian Scheipl, and Bernd Bischl. 2021. “A General Machine Learning Framework for Survival Analysis.” Machine Learning and Knowledge Discovery in Databases. https://doi.org/10.1007/978-3-030-67664-3_10.
Berenguer, Abel Díaz, Tanmoy Mukherjee, Matias Bossa, Nikos Deligiannis, and Hichem Sahli. 2022. “Representation Learning with Information Theory for COVID-19 Detection.” arXiv. https://doi.org/10.48550/ARXIV.2207.01437.
Blackwell, Matthew, and Michael P. Olson. 2021. “Reducing Model Misspecification and Bias in the Estimation of Interactions.” Political Analysis 30 (July). https://doi.org/10.1017/pan.2021.19.
Blair, Graeme, Winston Chou, and Kosuke Imai. 2019. “List Experiments with Measurement Error.” Political Analysis 27 (May). https://doi.org/10.1017/pan.2018.56.
Blair, Graeme, and Kosuke Imai. 2012. “Statistical Analysis of List Experiments.” Political Analysis 20. https://doi.org/10.1093/pan/mpr048.
Blair, Graeme, Kosuke Imai, and Jason Lyall. 2014. “Comparing and Combining List and Endorsement Experiments: Evidence from Afghanistan.” American Journal of Political Science 58 (February). https://doi.org/10.1111/ajps.12086.
Blair, Graeme, Kosuke Imai, and Yang-Yang Zhou. 2015. “Design and Analysis of the Randomized Response Technique.” Journal of the American Statistical Association 110 (July). https://doi.org/10.1080/01621459.2015.1050028.
Blanchet, Juliette, and Anthony C. Davison. 2011. “Spatial Modeling of Extreme Snow Depth.” The Annals of Applied Statistics 5 (September). https://doi.org/10.1214/11-aoas464.
Bowers, Jake, and Katherine W. Drake. 2005. “EDA for HLM: Visualization When Probabilistic Inference Fails.” Political Analysis 13. https://doi.org/10.1093/pan/mpi031.
Brambor, Thomas, William Roberts Clark, and Matt Golder. 2006. “Understanding Interaction Models: Improving Empirical Analyses.” Political Analysis 14. https://doi.org/10.1093/pan/mpi014.
Brinkmann, Levin, Fabian Baumann, Jean-François Bonnefon, Maxime Derex, Thomas F. Müller, Anne-Marie Nussberger, Agnieszka Czaplicka, et al. 2023. “Machine Culture.” Nature Human Behaviour 7 (November). https://doi.org/10.1038/s41562-023-01742-2.
Brito, Luiz Fernando, John C. McEwan, Stephen Miller, Wendy Bain, Michael Lee, Ken Dodds, Sheryl-Anne Newman, Natalie Pickering, Flávio S. Schenkel, and Shannon Clarke. 2017. “Genetic Parameters for Various Growth, Carcass and Meat Quality Traits in a New Zealand Sheep Population.” Small Ruminant Research 154 (September). https://doi.org/10.1016/j.smallrumres.2017.07.011.
Brohan, Anthony, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, et al. 2023. “RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control.” arXiv. https://doi.org/10.48550/ARXIV.2307.15818.
Brown, Wesley M. 1981. “MECHANISMS OF EVOLUTION IN ANIMAL MITOCHONDRIAL DNA.” Annals of the New York Academy of Sciences 361 (February). https://doi.org/10.1111/j.1749-6632.1981.tb46515.x.
Bullock, Will, Kosuke Imai, and Jacob N. Shapiro. 2011. “Statistical Analysis of Endorsement Experiments: Measuring Support for Militant Groups in Pakistan.” Political Analysis 19. https://doi.org/10.1093/pan/mpr031.
Burke, Marshall, and Kyle Emerick. 2016. “Adaptation to Climate Change: Evidence from US Agriculture.” American Economic Journal: Economic Policy 8 (August). https://doi.org/10.1257/pol.20130025.
Butts, Kyle. 2021. “Geographic Difference-in-Discontinuities.” Applied Economics Letters 30 (November). https://doi.org/10.1080/13504851.2021.2005236.
Cao, Jialun, Meiziniu Li, Ming Wen, and Shing-chi Cheung. 2023. “A Study on Prompt Design, Advantages and Limitations of ChatGPT for Deep Learning Program Repair.” arXiv. https://doi.org/10.48550/ARXIV.2304.08191.
Carta, Thomas, Clément Romac, Thomas Wolf, Sylvain Lamprier, Olivier Sigaud, and Pierre-Yves Oudeyer. 2023. “Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning.” arXiv. https://doi.org/10.48550/ARXIV.2302.02662.
Castruccio, Stefano, and Michael L. Stein. 2013. “Global Space–Time Models for Climate Ensembles.” The Annals of Applied Statistics 7 (September). https://doi.org/10.1214/13-aoas656.
Cattaneo, Matias D., Nicolás Idrobo, and Rocío Titiunik. 2019. “A Practical Introduction to Regression Discontinuity Designs,” November. https://doi.org/10.1017/9781108684606.
Caughey, Devin, and Jasjeet S. Sekhon. 2011. “Elections and the Regression Discontinuity Design: Lessons from Close u.s. House Races, 1942–2008.” Political Analysis 19. https://doi.org/10.1093/pan/mpr032.
Chan, A.-W. 2004. “Outcome Reporting Bias in Randomized Trials Funded by the Canadian Institutes of Health Research.” Canadian Medical Association Journal 171 (September). https://doi.org/10.1503/cmaj.1041086.
Chen, Irene Y., Emma Pierson, Sherri Rose, Shalmali Joshi, Kadija Ferryman, and Marzyeh Ghassemi. 2021. “Ethical Machine Learning in Healthcare.” Annual Review of Biomedical Data Science 4 (July). https://doi.org/10.1146/annurev-biodatasci-092820-114757.
Chen, Lingjiao, Matei Zaharia, and James Zou. 2023. “How Is ChatGPT’s Behavior Changing over Time?” arXiv. https://doi.org/10.48550/ARXIV.2307.09009.
Chen, Li-Wei, Shinji Watanabe, and Alexander Rudnicky. 2023. “A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech.” arXiv. https://doi.org/10.48550/ARXIV.2302.04215.
Chen, Tianqi, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. “MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems.” arXiv. https://doi.org/10.48550/ARXIV.1512.01274.
Choi, Edward, Mohammad Taha Bahadori, Andy Schuetz, Walter F. Stewart, and Jimeng Sun. 2015. “Doctor AI: Predicting Clinical Events via Recurrent Neural Networks.” arXiv. https://doi.org/10.48550/ARXIV.1511.05942.
Choi, Edward, Andy Schuetz, Walter F Stewart, and Jimeng Sun. 2016. “Using Recurrent Neural Network Models for Early Detection of Heart Failure Onset.” Journal of the American Medical Informatics Association 24 (August). https://doi.org/10.1093/jamia/ocw112.
Chou, Winston, Kosuke Imai, and Bryn Rosenfeld. 2017. “Sensitive Survey Questions with Auxiliary Information.” Sociological Methods & Research 49 (December). https://doi.org/10.1177/0049124117729711.
Chung, Junyoung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. 2014. “Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling.” arXiv. https://doi.org/10.48550/ARXIV.1412.3555.
Cinelli, Carlos, and Chad Hazlett. 2019. “Making Sense of Sensitivity: Extending Omitted Variable Bias.” Journal of the Royal Statistical Society Series B: Statistical Methodology 82 (December). https://doi.org/10.1111/rssb.12348.
Clarke, Kevin A. 2003. “Nonparametric Model Discrimination in International Relations.” Journal of Conflict Resolution 47 (February). https://doi.org/10.1177/0022002702239512.
———. 2005. “The Phantom Menace: Omitted Variable Bias in Econometric Research.” Conflict Management and Peace Science 22 (September). https://doi.org/10.1080/07388940500339183.
———. 2007. “A Simple Distribution-Free Test for Nonnested Model Selection.” Political Analysis 15. https://doi.org/10.1093/pan/mpm004.
Clarke, Mike. 2004. “Doing New Research? Don’t Forget the Old.” PLoS Medicine 1 (November). https://doi.org/10.1371/journal.pmed.0010035.
Corstange, Daniel. 2009. “Sensitive Questions, Truthful Answers? Modeling the List Experiment with LISTIT.” Political Analysis 17. https://doi.org/10.1093/pan/mpn013.
Cuesta, Brandon de la, Naoki Egami, and Kosuke Imai. 2021. “Improving the External Validity of Conjoint Analysis: The Essential Role of Profile Distribution.” Political Analysis 30 (January). https://doi.org/10.1017/pan.2020.40.
Cui, Can, Yunsheng Ma, Xu Cao, Wenqian Ye, and Ziran Wang. 2023. “Receive, Reason, and React: Drive as You Say with Large Language Models in Autonomous Vehicles.” arXiv. https://doi.org/10.48550/ARXIV.2310.08034.
Dahlhaus, Rainer, Stefan Richter, and Wei Biao Wu. 2017. “Towards a General Theory for Non-Linear Locally Stationary Processes.” arXiv. https://doi.org/10.48550/ARXIV.1704.02860.
Dasgupta, Ishita, Christine Kaeser-Chen, Kenneth Marino, Arun Ahuja, Sheila Babayan, Felix Hill, and Rob Fergus. 2023. “Collaborating with Language Models for Embodied Reasoning.” arXiv. https://doi.org/10.48550/ARXIV.2302.00763.
Davidov, Eldad. 2009. “Measurement Equivalence of Nationalism and Constructive Patriotism in the ISSP: 34 Countries in a Comparative Perspective.” Political Analysis 17. https://doi.org/10.1093/pan/mpn014.
Decker, Christian, and Marco Ottaviani. 2023. “Preregistration and Credibility of Clinical Trials,” May. https://doi.org/10.1101/2023.05.22.23290326.
Dehghani, Mostafa, Aliaksei Severyn, Sascha Rothe, and Jaap Kamps. 2017. “Learning to Learn from Weak Supervision by Full Supervision.” arXiv. https://doi.org/10.48550/ARXIV.1711.11383.
Deng, Yinlin, Chunqiu Steven Xia, Chenyuan Yang, Shizhuo Dylan Zhang, Shujing Yang, and Lingming Zhang. 2023. “Large Language Models Are Edge-Case Fuzzers: Testing Deep Learning Libraries via FuzzGPT.” arXiv. https://doi.org/10.48550/ARXIV.2304.02014.
Denny, Paul, Viraj Kumar, and Nasser Giacaman. 2022. “Conversing with Copilot: Exploring Prompt Engineering for Solving CS1 Problems Using Natural Language.” arXiv. https://doi.org/10.48550/ARXIV.2210.15157.
Deshpande, Manasi, and Yue Li. 2019. “Who Is Screened Out? Application Costs and the Targeting of Disability Programs.” American Economic Journal: Economic Policy 11 (November). https://doi.org/10.1257/pol.20180076.
Ding, Yan, Xiaohan Zhang, Chris Paxton, and Shiqi Zhang. 2023. “Task and Motion Planning with Large Language Models for Object Rearrangement.” arXiv. https://doi.org/10.48550/ARXIV.2303.06247.
Dong, Guozhu, and Jinyan Li. 1999. “Efficient Mining of Emerging Patterns.” Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August. https://doi.org/10.1145/312129.312191.
Douglass, Rex W, David A Meyer, Megha Ram, David Rideout, and Dongjin Song. 2015. “High Resolution Population Estimates from Telecommunications Data.” EPJ Data Science 4 (May). https://doi.org/10.1140/epjds/s13688-015-0040-6.
Drain, Dawn, Colin B. Clement, Guillermo Serrato, and Neel Sundaresan. 2021. “DeepDebug: Fixing Python Bugs Using Stack Traces, Backtranslation, and Code Skeletons.” arXiv. https://doi.org/10.48550/ARXIV.2105.09352.
Driess, Danny, Fei Xia, Mehdi S. M. Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, et al. 2023. “PaLM-e: An Embodied Multimodal Language Model.” arXiv. https://doi.org/10.48550/ARXIV.2303.03378.
Du, Hongyang, Zonghang Li, Dusit Niyato, Jiawen Kang, Zehui Xiong, Xuemin, and Dong In Kim. 2023. “Enabling AI-Generated Content (AIGC) Services in Wireless Edge Networks.” arXiv. https://doi.org/10.48550/ARXIV.2301.03220.
Du, Yilun, Mengjiao Yang, Pete Florence, Fei Xia, Ayzaan Wahid, Brian Ichter, Pierre Sermanet, et al. 2023. “Video Language Planning.” arXiv. https://doi.org/10.48550/ARXIV.2310.10625.
Du, Yuqing, Ksenia Konyushkova, Misha Denil, Akhil Raju, Jessica Landon, Felix Hill, Nando de Freitas, and Serkan Cabi. 2023. “Vision-Language Models as Success Detectors.” arXiv. https://doi.org/10.48550/ARXIV.2303.07280.
Du, Yuqing, Olivia Watkins, Zihan Wang, Cédric Colas, Trevor Darrell, Pieter Abbeel, Abhishek Gupta, and Jacob Andreas. 2023. “Guiding Pretraining in Reinforcement Learning with Large Language Models.” arXiv. https://doi.org/10.48550/ARXIV.2302.06692.
Dubey, Abhimanyu, Nikhil Naik, Devi Parikh, Ramesh Raskar, and César A. Hidalgo. 2016. “Deep Learning the City : Quantifying Urban Perception at a Global Scale.” arXiv. https://doi.org/10.48550/ARXIV.1608.01769.
Dunning, Thad. 2008. “Model Specification in Instrumental-Variables Regression.” Political Analysis 16. https://doi.org/10.1093/pan/mpm039.
Dwan, Kerry, Douglas G. Altman, Juan A. Arnaiz, Jill Bloom, An-Wen Chan, Eugenia Cronin, Evelyne Decullier, et al. 2008. “Systematic Review of the Empirical Evidence of Study Publication Bias and Outcome Reporting Bias.” PLoS ONE 3 (August). https://doi.org/10.1371/journal.pone.0003081.
Dwan, Kerry, Douglas G. Altman, Mike Clarke, Carrol Gamble, Julian P. T. Higgins, Jonathan A. C. Sterne, Paula R. Williamson, and Jamie J. Kirkham. 2014. “Evidence for the Selective Reporting of Analyses and Discrepancies in Clinical Trials: A Systematic Review of Cohort Studies of Clinical Trials.” PLoS Medicine 11 (June). https://doi.org/10.1371/journal.pmed.1001666.
Dwan, Kerry, Carrol Gamble, Paula R. Williamson, and Jamie J. Kirkham. 2013. “Systematic Review of the Empirical Evidence of Study Publication Bias and Outcome Reporting Bias — an Updated Review.” PLoS ONE 8 (July). https://doi.org/10.1371/journal.pone.0066844.
Eckles, Dean, Nikolaos Ignatiadis, Stefan Wager, and Han Wu. 2020. “Noise-Induced Randomization in Regression Discontinuity Designs.” arXiv. https://doi.org/10.48550/ARXIV.2004.09458.
Egami, Naoki, and Erin Hartman. 2022. “Elements of External Validity: Framework, Design, and Analysis.” American Political Science Review 117 (October). https://doi.org/10.1017/s0003055422000880.
Egami, Naoki, and Kosuke Imai. 2018. “Causal Interaction in Factorial Experiments: Application to Conjoint Analysis.” Journal of the American Statistical Association 114 (August). https://doi.org/10.1080/01621459.2018.1476246.
Enamorado, Ted, Benjamin Fifield, and Kosuke Imai. 2019. “Using a Probabilistic Model to Assist Merging of Large-Scale Administrative Records.” American Political Science Review 113 (January). https://doi.org/10.1017/s0003055418000783.
Enamorado, Ted, and Kosuke Imai. 2019. “Validating Self-Reported Turnout by Linking Public Opinion Surveys with Administrative Records.” Public Opinion Quarterly 83. https://doi.org/10.1093/poq/nfz051.
Fang, Meng, Shilong Deng, Yudi Zhang, Zijing Shi, Ling Chen, Mykola Pechenizkiy, and Jun Wang. 2024. “Large Language Models Are Neurosymbolic Reasoners.” arXiv. https://doi.org/10.48550/ARXIV.2401.09334.
Fellegi, Ivan P., and Alan B. Sunter. 1969. “A Theory for Record Linkage.” Journal of the American Statistical Association 64 (December). https://doi.org/10.1080/01621459.1969.10501049.
Fifield, Benjamin, Michael Higgins, Kosuke Imai, and Alexander Tarr. 2020. “Automated Redistricting Simulation Using Markov Chain Monte Carlo.” Journal of Computational and Graphical Statistics 29 (May). https://doi.org/10.1080/10618600.2020.1739532.
Findlay, Robert H., Gary M. King, and Les Watling. 1989. “Efficacy of Phospholipid Analysis in Determining Microbial Biomass in Sediments.” Applied and Environmental Microbiology 55 (November). https://doi.org/10.1128/aem.55.11.2888-2893.1989.
Flanagin, Annette, Kirsten Bibbins-Domingo, Michael Berkwits, and Stacy L. Christiansen. 2023. “Nonhuman ‘Authors’ and Implications for the Integrity of Scientific Publication and Medical Knowledge.” JAMA 329 (February). https://doi.org/10.1001/jama.2023.1344.
Fowler, Anthony, and Andrew B. Hall. 2018. “Do Shark Attacks Influence Presidential Elections? Reassessing a Prominent Finding on Voter Competence.” The Journal of Politics 80 (October). https://doi.org/10.1086/699244.
Fu, Anqi, Balasubramanian Narasimhan, and Stephen Boyd. 2020. “CVXR: An R Package for Disciplined Convex Optimization.” Journal of Statistical Software 94. https://doi.org/10.18637/jss.v094.i14.
Gaynor, Martin, Rodrigo Moreno-Serra, and Carol Propper. 2013. “Death by Market Power: Reform, Competition, and Patient Outcomes in the National Health Service.” American Economic Journal: Economic Policy 5 (November). https://doi.org/10.1257/pol.5.4.134.
Gelman, Andrew. 2007a. “Struggles with Survey Weighting and Regression Modeling.” Statistical Science 22 (May). https://doi.org/10.1214/088342306000000691.
———. 2007b. “Rich State, Poor State, Red State, Blue State: What’s the Matter with Connecticut?” Quarterly Journal of Political Science 2 (November). https://doi.org/10.1561/100.00006026.
Gelman, Andrew, and Julia Azari. 2017. “19 Things We Learned from the 2016 Election.” Statistics and Public Policy 4 (January). https://doi.org/10.1080/2330443x.2017.1356775.
Gelman, Andrew, Jessica Hullman, Christopher Wlezien, and George Elliott Morris. 2020. “Information, Incentives, and Goals in Election Forecasts.” Judgment and Decision Making 15 (September). https://doi.org/10.1017/s1930297500007981.
Gelman, Andrew, and Guido Imbens. 2018. “Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs.” Journal of Business & Economic Statistics 37 (May). https://doi.org/10.1080/07350015.2017.1366909.
Gelman, Andrew, and Gary King. 1990. “Estimating Incumbency Advantage Without Bias.” American Journal of Political Science 34 (November). https://doi.org/10.2307/2111475.
———. 1994. “Enhancing Democracy Through Legislative Redistricting.” American Political Science Review 88 (September). https://doi.org/10.2307/2944794.
Gelman, Andrew, and David K. Park. 2009. “Splitting a Predictor at the Upper Quarter or Third and the Lower Quarter or Third.” The American Statistician 63 (February). https://doi.org/10.1198/tast.2009.0001.
Gelman, Andrew, Boris Shor, Joseph Bafumi, and David Park. 2005. “Rich State, Poor State, Red State, Blue State: What’s the Matter with Connecticut?” SSRN Electronic Journal. https://doi.org/10.2139/ssrn.1010426.
Geloven, Nan van, Sonja A. Swanson, Chava L. Ramspek, Kim Luijken, Merel van Diepen, Tim P. Morris, Rolf H. H. Groenwold, Hans C. van Houwelingen, Hein Putter, and Saskia le Cessie. 2020. “Prediction Meets Causal Inference: The Role of Treatment in Clinical Prediction Models.” European Journal of Epidemiology 35 (May). https://doi.org/10.1007/s10654-020-00636-1.
Ghitza, Yair, and Andrew Gelman. 2013. “Deep Interactions with MRP: Election Turnout and Voting Patterns Among Small Electoral Subgroups.” American Journal of Political Science 57 (February). https://doi.org/10.1111/ajps.12004.
Glynn, Adam N., and Nahomi Ichino. 2014. “Using Qualitative Information to Improve Causal Inference.” American Journal of Political Science 59 (December). https://doi.org/10.1111/ajps.12154.
Gräßle, Tobias, Catherine Crockford, Cornelius Eichner, Cédric Girard‐Buttoz, Carsten Jäger, Evgeniya Kirilina, Ilona Lipp, et al. 2023. “Sourcing High Tissue Quality Brains from Deceased Wild Primates with Known Socio‐ecology.” Methods in Ecology and Evolution 14 (January). https://doi.org/10.1111/2041-210x.14039.
Grofman, Bernard, and Gary King. 2007. “The Future of Partisan Symmetry as a Judicial Test for Partisan Gerrymandering After LULAC v. Perry.” Election Law Journal: Rules, Politics, and Policy 6 (March). https://doi.org/10.1089/elj.2006.6002.
Hainmueller, Jens, Daniel J. Hopkins, and Teppei Yamamoto. 2014. “Causal Inference in Conjoint Analysis: Understanding Multidimensional Choices via Stated Preference Experiments.” Political Analysis 22. https://doi.org/10.1093/pan/mpt024.
Hainmueller, Jens, Jonathan Mummolo, and Yiqing Xu. 2018. “How Much Should We Trust Estimates from Multiplicative Interaction Models? Simple Tools to Improve Empirical Practice.” Political Analysis 27 (December). https://doi.org/10.1017/pan.2018.46.
Hallsworth, John E., Zulema Udaondo, Carlos Pedrós‐Alió, Juan Höfer, Kathleen C. Benison, Karen G. Lloyd, Radamés J. B. Cordero, Claudia B. L. de Campos, Michail M. Yakimov, and Ricardo Amils. 2023. “Scientific Novelty Beyond the Experiment.” Microbial Biotechnology 16 (February). https://doi.org/10.1111/1751-7915.14222.
Halterman, Andrew. 2019. “Geolocating Political Events in Text.” Proceedings of the Third Workshop on Natural Language Processing and Computational Social Science. https://doi.org/10.18653/v1/w19-2104.
Hanmer, Michael J., and Kerem Ozan Kalkan. 2012. “Behind the Curve: Clarifying the Best Approach to Calculating Predicted Probabilities and Marginal Effects from Limited Dependent Variable Models.” American Journal of Political Science 57 (July). https://doi.org/10.1111/j.1540-5907.2012.00602.x.
“Harvard Data Science Review.” n.d. https://doi.org/10.1162/99608f92.
Hazra, Arnab, and Raphaël Huser. 2021. “Estimating High-Resolution Red Sea Surface Temperature Hotspots, Using a Low-Rank Semiparametric Spatial Model.” The Annals of Applied Statistics 15 (June). https://doi.org/10.1214/20-aoas1418.
Heagerty, Patrick J., Thomas Lumley, and Margaret S. Pepe. 2000. “Time‐dependent ROC Curves for Censored Survival Data and a Diagnostic Marker.” Biometrics 56 (June). https://doi.org/10.1111/j.0006-341x.2000.00337.x.
Heersink, Boris, Brenton D. Peterson, and Jeffery A. Jenkins. 2017. “Disasters and Elections: Estimating the Net Effect of Damage and Relief in Historical Perspective.” Political Analysis 25 (April). https://doi.org/10.1017/pan.2017.7.
Heidemanns, Merlin, Andrew Gelman, and G. Elliott Morris. 2020. “An Updated Dynamic Bayesian Forecasting Model for the US Presidential Election.” Harvard Data Science Review 2 (October). https://doi.org/10.1162/99608f92.fc62f1e1.
Herron, Michael C. 1999. “Postestimation Uncertainty in Limited Dependent Variable Models.” Political Analysis 8. https://doi.org/10.1093/oxfordjournals.pan.a029806.
Hill, Seth J., and Margaret E. Roberts. 2023. “Acquiescence Bias Inflates Estimates of Conspiratorial Beliefs and Political Misperceptions.” Political Analysis 31 (January). https://doi.org/10.1017/pan.2022.28.
Hinne, Max, Quentin F. Gronau, Don van den Bergh, and Eric-Jan Wagenmakers. 2020. “A Conceptual Introduction to Bayesian Model Averaging.” Advances in Methods and Practices in Psychological Science 3 (June). https://doi.org/10.1177/2515245919898657.
Ho, Daniel E., and Kosuke Imai. 2008. “Estimating Causal Effects of Ballot Order from a Randomized Natural Experiment.” Public Opinion Quarterly 72. https://doi.org/10.1093/poq/nfn018.
Ho, Daniel E, and Kosuke Imai. 2006. “Randomization Inference with Natural Experiments.” Journal of the American Statistical Association 101 (September). https://doi.org/10.1198/016214505000001258.
Hollmann, Noah, Samuel Müller, and Frank Hutter. 2023. “Large Language Models for Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering.” arXiv. https://doi.org/10.48550/ARXIV.2305.03403.
Hopkins, D. J., and G. King. 2010. “Improving Anchoring Vignettes: Designing Surveys to Correct Interpersonal Incomparability.” Public Opinion Quarterly 74 (March). https://doi.org/10.1093/poq/nfq011.
Horiuchi, Yusaku, Kosuke Imai, and Naoko Taniguchi. 2007. “Designing and Analyzing Randomized Experiments: Application to a Japanese Election Survey Experiment.” American Journal of Political Science 51 (June). https://doi.org/10.1111/j.1540-5907.2007.00274.x.
Hu, Yan, Qingyu Chen, Jingcheng Du, Xueqing Peng, Vipina Kuttichi Keloth, Xu Zuo, Yujia Zhou, et al. 2023. “Improving Large Language Models for Clinical Named Entity Recognition via Prompt Engineering.” arXiv. https://doi.org/10.48550/ARXIV.2303.16416.
Hua, William, Hongyuan Mei, Sarah Zohar, Magali Giral, and Yanxun Xu. 2022. “Personalized Dynamic Treatment Regimes in Continuous Time: A Bayesian Approach for Optimizing Clinical Decisions with Timing.” Bayesian Analysis 17 (September). https://doi.org/10.1214/21-ba1276.
Huang, Chenguang, Oier Mees, Andy Zeng, and Wolfram Burgard. 2022. “Visual Language Maps for Robot Navigation.” arXiv. https://doi.org/10.48550/ARXIV.2210.05714.
———. 2023. “Audio Visual Language Maps for Robot Navigation.” arXiv. https://doi.org/10.48550/ARXIV.2303.07522.
Huang, Tao, Zhihong Sun, Zhi Jin, Ge Li, and Chen Lyu. 2024. “Knowledge-Aware Code Generation with Large Language Models.” arXiv. https://doi.org/10.48550/ARXIV.2401.15940.
Huang, Wenlong, Pieter Abbeel, Deepak Pathak, and Igor Mordatch. 2022. “Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents.” arXiv. https://doi.org/10.48550/ARXIV.2201.07207.
Huang, Wenlong, Fei Xia, Dhruv Shah, Danny Driess, Andy Zeng, Yao Lu, Pete Florence, et al. 2023. “Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents.” arXiv. https://doi.org/10.48550/ARXIV.2303.00855.
Iglesias, T. L., R. McElreath, and G. L. Patricelli. 2012. “Western Scrub-Jay Funerals: Cacophonous Aggregations in Response to Dead Conspecifics.” Animal Behaviour 84 (November). https://doi.org/10.1016/j.anbehav.2012.08.007.
Imai, Kosuke. 2011. “Multivariate Regression Analysis for the Item Count Technique.” Journal of the American Statistical Association 106 (June). https://doi.org/10.1198/jasa.2011.ap10415.
Imai, Kosuke, Zhichao Jiang, and Anup Malani. 2020. “Causal Inference with Interference and Noncompliance in Two-Stage Randomized Experiments.” Journal of the American Statistical Association 116 (July). https://doi.org/10.1080/01621459.2020.1775612.
Imai, Kosuke, and Kabir Khanna. 2016. “Improving Ecological Inference by Predicting Individual Ethnicity from Voter Registration Records.” Political Analysis 24. https://doi.org/10.1093/pan/mpw001.
Imai, Kosuke, and Gary King. 2004. “Did Illegal Overseas Absentee Ballots Decide the 2000 u.s. Presidential Election?” Perspectives on Politics 2 (September). https://doi.org/10.1017/s1537592704040332.
Imai, Kosuke, Gary King, and Carlos Velasco Rivera. 2019. “Replication Data for: "Do Nonpartisan Programmatic Policies Generate Partisan Electoral Effects? Evidence from Two Large Scale Experiments".” https://doi.org/10.7910/DVN/70SNIS.
Imai, Kosuke, James Lo, and Jonathan Olmsted. 2016. “Fast Estimation of Ideal Points with Massive Data.” American Political Science Review 110 (November). https://doi.org/10.1017/s000305541600037x.
Imai, Kosuke, Ying Lu, and Aaron Strauss. 2007. “Bayesian and Likelihood Inference for 2 × 2 Ecological Tables: An Incomplete-Data Approach.” Political Analysis 16 (August). https://doi.org/10.1093/pan/mpm017.
Imai, Kosuke, and Yang Ning. 2023. “Covariate Balancing Propensity Score.” Handbook of Matching and Weighting Adjustments for Causal Inference, March. https://doi.org/10.1201/9781003102670-15.
Imai, Kosuke, Bethany Park, and Kenneth F. Greene. 2015. “Using the Predicted Responses from List Experiments as Explanatory Variables in Regression Models.” Political Analysis 23. https://doi.org/10.1093/pan/mpu017.
Imai, Kosuke, and Aaron Strauss. 2011. “Estimation of Heterogeneous Treatment Effects from Randomized Experiments, with Application to the Optimal Planning of the Get-Out-the-Vote Campaign.” Political Analysis 19. https://doi.org/10.1093/pan/mpq035.
Imai, Kosuke, and Teppei Yamamoto. 2010. “Replication Data for: Causal Inference with Differential Measurement Error: Nonparametric Identification and Sensitivity Analysis.” https://doi.org/10.7910/DVN/TZOGL9.
Imbens, Guido, and Thomas Lemieux. 2007. “Regression Discontinuity Designs: A Guide to Practice,” April. https://doi.org/10.3386/w13039.
Incerti, Trevor. 2020. “Corruption Information and Vote Share: A Meta-Analysis and Lessons for Experimental Design.” American Political Science Review 114 (June). https://doi.org/10.1017/s000305542000012x.
Ioannidis, John P. A. 2016. “Why Most Clinical Research Is Not Useful.” PLOS Medicine 13 (June). https://doi.org/10.1371/journal.pmed.1002049.
Ivashchenko, Tetiana, Andrii Ivashchenko, and Nelia Vasylets. 2023. “The Ways of Introducing AI/ML-Based Prediction Methods for the Improvement of the System of Government Socio-Economic Administration in Ukraine.” Business: Theory and Practice 24 (November). https://doi.org/10.3846/btp.2023.18733.
Jackman, Simon. 2001. “Multidimensional Analysis of Roll Call Data via Bayesian Simulation: Identification, Estimation, Inference, and Model Checking.” Political Analysis 9 (January). https://doi.org/10.1093/polana/9.3.227.
Jensen, Tina Birk, Anders Ringgaard Kristensen, Nils Toft, Niels Peter Baadsgaard, Søren Østergaard, and Hans Houe. 2009. “An Object-Oriented Bayesian Network Modeling the Causes of Leg Disorders in Finisher Herds.” Preventive Veterinary Medicine 89 (June). https://doi.org/10.1016/j.prevetmed.2009.02.009.
Jiang, Nan, Kevin Liu, Thibaud Lutellier, and Lin Tan. 2023. “Impact of Code Language Models on Automated Program Repair.” arXiv. https://doi.org/10.48550/ARXIV.2302.05020.
Jochem, Warren C., and Andrew J. Tatem. 2021. “Tools for Mapping Multi-Scale Settlement Patterns of Building Footprints: An Introduction to the r Package Foot.” PLOS ONE 16 (February). https://doi.org/10.1371/journal.pone.0247535.
Jombart, Thibaut, and Ismaïl Ahmed. 2011. “adegenet 1.3-1: New Tools for the Analysis of Genome-Wide SNP Data.” Bioinformatics 27 (September). https://doi.org/10.1093/bioinformatics/btr521.
Kammoun, Amina, Rim Slama, Hedi Tabia, Tarek Ouni, and Mohmed Abid. 2022. “Generative Adversarial Networks for Face Generation: A Survey.” ACM Computing Surveys, March. https://doi.org/10.1145/1122445.1122456.
Kang, Hyunseung, Yang Jiang, Qingyuan Zhao, and Dylan S. Small. 2020. “Ivmodel: An r Package for Inference and Sensitivity Analysis of Instrumental Variables Models with One Endogenous Variable.” arXiv. https://doi.org/10.48550/ARXIV.2002.08457.
Kao, Edward K. 2017. “Causal Inference Under Network Interference: A Framework for Experiments on Social Networks.” arXiv. https://doi.org/10.48550/ARXIV.1708.08522.
Karamcheti, Siddharth, Suraj Nair, Annie S. Chen, Thomas Kollar, Chelsea Finn, Dorsa Sadigh, and Percy Liang. 2023. “Language-Driven Representation Learning for Robotics.” arXiv. https://doi.org/10.48550/ARXIV.2302.12766.
Katz, Jonathan N., and Gary King. 1999. “A Statistical Model for Multiparty Electoral Data.” American Political Science Review 93 (March). https://doi.org/10.2307/2585758.
Kazemitabaar, Majeed, Justin Chow, Carl Ka To Ma, Barbara J. Ericson, David Weintrop, and Tovi Grossman. 2023. “Studying the Effect of AI Code Generators on Supporting Novice Learners in Introductory Programming.” Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, April. https://doi.org/10.1145/3544548.3580919.
Keele, Luke. 2015. “The Statistics of Causal Inference: A View from Political Methodology.” Political Analysis 23. https://doi.org/10.1093/pan/mpv007.
Keele, Luke J., and Rocío Titiunik. 2015. “Geographic Boundaries as Regression Discontinuities.” Political Analysis 23. https://doi.org/10.1093/pan/mpu014.
Keele, Luke, Randolph T. Stevenson, and Felix Elwert. 2019. “The Causal Interpretation of Estimated Associations in Regression Models.” Political Science Research and Methods 8 (July). https://doi.org/10.1017/psrm.2019.31.
Keele, Luke, and Rocío Titiunik. 2015. “Natural Experiments Based on Geography.” Political Science Research and Methods 4 (April). https://doi.org/10.1017/psrm.2015.4.
Keet, C. Maria. 2020. “The African Wildlife Ontology Tutorial Ontologies.” Journal of Biomedical Semantics 11 (June). https://doi.org/10.1186/s13326-020-00224-y.
Killick, Rebecca, and Idris A. Eckley. 2014. “changepoint: An R Package for Changepoint Analysis.” Journal of Statistical Software 58. https://doi.org/10.18637/jss.v058.i03.
Kim, Geon Woo, Ju-Pyo Hong, Hea-Young Lee, Jin-Kyung Kwon, Dong-Am Kim, and Byoung-Cheorl Kang. 2022. “Genomic Selection with Fixed-Effect Markers Improves the Prediction Accuracy for Capsaicinoid Contents in Capsicum Annuum.” Horticulture Research 9. https://doi.org/10.1093/hr/uhac204.
Kim, Geunwoo, Pierre Baldi, and Stephen McAleer. 2023. “Language Models Can Solve Computer Tasks.” arXiv. https://doi.org/10.48550/ARXIV.2303.17491.
Kim, Seung-Jean, Kwangmoo Koh, Stephen Boyd, and Dimitry Gorinevsky. 2009. “\(\ell_1\) Trend Filtering.” SIAM Review 51 (May). https://doi.org/10.1137/070690274.
King, G. M. 1994. “Associations of Methanotrophs with the Roots and Rhizomes of Aquatic Vegetation.” Applied and Environmental Microbiology 60 (September). https://doi.org/10.1128/aem.60.9.3220-3227.1994.
King, Gary. 1986. “How Not to Lie with Statistics: Avoiding Common Mistakes in Quantitative Political Science.” American Journal of Political Science 30 (August). https://doi.org/10.2307/2111095.
———. 1988. “Statistical Models for Political Science Event Counts: Bias in Conventional Procedures and Evidence for the Exponential Poisson Regression Model.” American Journal of Political Science 32 (August). https://doi.org/10.2307/2111248.
———. 1989a. “Event Count Models for International Relations: Generalizations and Applications.” International Studies Quarterly 33 (June). https://doi.org/10.2307/2600534.
———. 1989b. “Variance Specification in Event Count Models: From Restrictive Assumptions to a Generalized Estimator.” American Journal of Political Science 33 (August). https://doi.org/10.2307/2111071.
———. 1991a. “Constituency Service and Incumbency Advantage.” British Journal of Political Science 21 (January). https://doi.org/10.1017/s0007123400006062.
———. 1991b. “‘Truth’ Is Stranger Than Prediction, More Questionable Than Causal Inference.” American Journal of Political Science 35 (November). https://doi.org/10.2307/2111506.
King, Gary M. 1984. “Metabolism of Trimethylamine, Choline, and Glycine Betaine by Sulfate-Reducing and Methanogenic Bacteria in Marine Sediments.” Applied and Environmental Microbiology 48 (October). https://doi.org/10.1128/aem.48.4.719-725.1984.
King, Gary M., Craig Judd, Cheryl R. Kuske, and Conor Smith. 2012. “Analysis of Stomach and Gut Microbiomes of the Eastern Oyster (Crassostrea Virginica) from Coastal Louisiana, USA.” PLoS ONE 7 (December). https://doi.org/10.1371/journal.pone.0051475.
King, Gary M., M. J. Klug, and D. R. Lovley. 1983. “Metabolism of Acetate, Methanol, and Methylated Amines in Intertidal Sediments of Lowes Cove, Maine.” Applied and Environmental Microbiology 45 (June). https://doi.org/10.1128/aem.45.6.1848-1853.1983.
King, Gary M., Peter Roslev, and Henrik Skovgaard. 1990. “Distribution and Rate of Methane Oxidation in Sediments of the Florida Everglades.” Applied and Environmental Microbiology 56 (September). https://doi.org/10.1128/aem.56.9.2902-2911.1990.
King, Gary M., and Sylvia Schnell. 1994. “Ammonium and Nitrite Inhibition of Methane Oxidation by Methylobacter Albus BG8 and Methylosinus Trichosporium OB3b at Low Methane Concentrations.” Applied and Environmental Microbiology 60 (October). https://doi.org/10.1128/aem.60.10.3508-3513.1994.
King, Gary, and Robert X. Browning. 1987. “Democratic Representation and Partisan Bias in Congressional Elections.” American Political Science Review 81 (December). https://doi.org/10.2307/1962588.
King, Gary, Emmanuela Gakidou, Nirmala Ravishankar, Ryan T. Moore, Jason Lakin, Manett Vargas, Martha María Téllez‐Rojo, Juan Eugenio Hernández Ávila, Mauricio Hernández Ávila, and Héctor Hernández Llamas. 2007. “A ‘Politically Robust’ Experimental Design for Public Policy Evaluation, with Application to the Mexican Universal Health Insurance Program.” Journal of Policy Analysis and Management 26 (May). https://doi.org/10.1002/pam.20279.
King, Gary, and Andrew Gelman. 1991. “Systemic Consequences of Incumbency Advantage in U.S. House Elections.” American Journal of Political Science 35 (February). https://doi.org/10.2307/2111440.
King, Gary, Christopher J. L. Murray, Joshua A. Salomon, and Ajay Tandon. 2004. “Enhancing the Validity and Cross-Cultural Comparability of Measurement in Survey Research.” American Political Science Review 98 (February). https://doi.org/10.1017/s000305540400108x.
King, Gary, and Margaret E. Roberts. 2015. “How Robust Standard Errors Expose Methodological Problems They Do Not Fix, and What to Do about It.” Political Analysis 23. https://doi.org/10.1093/pan/mpu015.
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44 (April). https://doi.org/10.2307/2669316.
King, Gary, and Jonathan Wand. 2007. “Comparing Incomparable Survey Responses: Evaluating and Selecting Anchoring Vignettes.” Political Analysis 15. https://doi.org/10.1093/pan/mpl011.
King, Gary, and Langche Zeng. 2001a. “Explaining Rare Events in International Relations.” International Organization 55. https://doi.org/10.1162/00208180152507597.
———. 2001b. “Logistic Regression in Rare Events Data.” Political Analysis 9. https://doi.org/10.1093/oxfordjournals.pan.a004868.
Knight, Richard W. 1984. “Introduction to a New Sea-Ice Database.” Annals of Glaciology 5. https://doi.org/10.3189/1984aog5-1-81-84.
Knott, Alistair, Dino Pedreschi, Raja Chatila, Tapabrata Chakraborti, Susan Leavy, Ricardo Baeza-Yates, David Eyers, et al. 2023. “Generative AI Models Should Include Detection Mechanisms as a Condition for Public Release.” Ethics and Information Technology 25 (October). https://doi.org/10.1007/s10676-023-09728-4.
Kovářík, Jaromír, Dan Levin, and Tao Wang. 2016. “Ellsberg Paradox: Ambiguity and Complexity Aversions Compared.” Journal of Risk and Uncertainty 52 (February). https://doi.org/10.1007/s11166-016-9232-0.
Kreuzberger, Dominik, Niklas Kühl, and Sebastian Hirschl. 2022. “Machine Learning Operations (MLOps): Overview, Definition, and Architecture.” arXiv. https://doi.org/10.48550/ARXIV.2205.02302.
Kwon, Minae, Sang Michael Xie, Kalesha Bullard, and Dorsa Sadigh. 2023. “Reward Design with Language Models.” arXiv. https://doi.org/10.48550/ARXIV.2303.00001.
Lal, Apoorva, Mac Lockhart, Yiqing Xu, and Ziwen Zhu. 2023. “How Much Should We Trust Instrumental Variable Estimates in Political Science? Practical Advice Based on over 60 Replicated Studies.” arXiv. https://doi.org/10.48550/ARXIV.2303.11399.
Lau, Ada, and Patrick McSharry. 2010. “Approaches for Multi-Step Density Forecasts with Application to Aggregated Wind Power.” The Annals of Applied Statistics 4 (September). https://doi.org/10.1214/09-aoas320.
Leeper, Thomas J., Sara B. Hobolt, and James Tilley. 2019. “Measuring Subgroup Preferences in Conjoint Experiments.” Political Analysis 28 (August). https://doi.org/10.1017/pan.2019.30.
Leinonen, Juho, Paul Denny, Stephen MacNeil, Sami Sarsa, Seth Bernstein, Joanne Kim, Andrew Tran, and Arto Hellas. 2023. “Comparing Code Explanations Created by Students and Large Language Models.” arXiv. https://doi.org/10.48550/ARXIV.2304.03938.
Leinonen, Juho, Arto Hellas, Sami Sarsa, Brent Reeves, Paul Denny, James Prather, and Brett A. Becker. 2023. “Using Large Language Models to Enhance Programming Error Messages.” Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 1, March. https://doi.org/10.1145/3545945.3569770.
Letunic, Ivica, and Peer Bork. 2016. “Interactive Tree of Life (iTOL) V3: An Online Tool for the Display and Annotation of Phylogenetic and Other Trees.” Nucleic Acids Research 44 (April). https://doi.org/10.1093/nar/gkw290.
———. 2021. “Interactive Tree of Life (iTOL) V5: An Online Tool for Phylogenetic Tree Display and Annotation.” Nucleic Acids Research 49 (April). https://doi.org/10.1093/nar/gkab301.
Lewis, Jeffrey B., and Gary King. 1999. “No Evidence on Directional vs. Proximity Voting.” Political Analysis 8. https://doi.org/10.1093/oxfordjournals.pan.a029803.
Leyk, Stefan, Andrea E. Gaughan, Susana B. Adamo, Alex de Sherbinin, Deborah Balk, Sergio Freire, Amy Rose, et al. 2019. “The Spatial Allocation of Population: A Review of Large-Scale Gridded Population Data Products and Their Fitness for Use.” Earth System Science Data 11 (September). https://doi.org/10.5194/essd-11-1385-2019.
Li, Jia, Yongmin Li, Ge Li, Zhi Jin, Yiyang Hao, and Xing Hu. 2023. “SkCoder: A Sketch-Based Approach for Automatic Code Generation.” arXiv. https://doi.org/10.48550/ARXIV.2302.06144.
Liang, Jacky, Wenlong Huang, Fei Xia, Peng Xu, Karol Hausman, Brian Ichter, Pete Florence, and Andy Zeng. 2022. “Code as Policies: Language Model Programs for Embodied Control.” arXiv. https://doi.org/10.48550/ARXIV.2209.07753.
Lin, Kevin, Christopher Agia, Toki Migimatsu, Marco Pavone, and Jeannette Bohg. 2023. “Text2Motion: From Natural Language Instructions to Feasible Plans.” Autonomous Robots 47 (November). https://doi.org/10.1007/s10514-023-10131-7.
Lipton, Zachary C., David C. Kale, Charles Elkan, and Randall Wetzel. 2015. “Learning to Diagnose with LSTM Recurrent Neural Networks.” arXiv. https://doi.org/10.48550/ARXIV.1511.03677.
Liu, Bo, Yuqian Jiang, Xiaohan Zhang, Qiang Liu, Shiqi Zhang, Joydeep Biswas, and Peter Stone. 2023. “LLM+p: Empowering Large Language Models with Optimal Planning Proficiency.” arXiv. https://doi.org/10.48550/ARXIV.2304.11477.
Liu, Jiawei, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. 2023. “Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation.” arXiv. https://doi.org/10.48550/ARXIV.2305.01210.
Liu, Siru, Aileen P. Wright, Barron L. Patterson, Jonathan P. Wanderer, Robert W. Turer, Scott D. Nelson, Allison B. McCoy, Dean F. Sittig, and Adam Wright. 2023. “Assessing the Value of ChatGPT for Clinical Decision Support Optimization,” February. https://doi.org/10.1101/2023.02.21.23286254.
Lütkepohl, Helmut, and Fang Xu. 2010. “The Role of the Log Transformation in Forecasting Economic Variables.” Empirical Economics 42 (December). https://doi.org/10.1007/s00181-010-0440-1.
Lybarger, Kevin, Meliha Yetisgen, and Özlem Uzuner. 2023. “The 2022 n2c2/UW Shared Task on Extracting Social Determinants of Health.” Journal of the American Medical Informatics Association 30 (April). https://doi.org/10.1093/jamia/ocad012.
MacNeil, Stephen, Andrew Tran, Arto Hellas, Joanne Kim, Sami Sarsa, Paul Denny, Seth Bernstein, and Juho Leinonen. 2022. “Experiences from Using Code Explanations Generated by Large Language Models in a Web Software Development e-Book.” arXiv. https://doi.org/10.48550/ARXIV.2211.02265.
Madigan, D., E. Elnahrawy, R. P. Martin, Wen-Hua Ju, P. Krishnan, and A. S. Krishnakumar. n.d. “Bayesian Indoor Positioning Systems.” Proceedings IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies. https://doi.org/10.1109/infcom.2005.1498348.
Mäkela, Satu-Marja, Arttu Lämsä, Janne S. Keränen, Jussi Liikka, Jussi Ronkainen, Johannes Peltola, Juha Häikiö, Sari Järvinen, and Miguel Bordallo López. 2021. “Introducing VTT-ConIot: A Realistic Dataset for Activity Recognition of Construction Workers Using IMU Devices.” Sustainability 14 (December). https://doi.org/10.3390/su14010220.
Makoviychuk, Viktor, Lukasz Wawrzyniak, Yunrong Guo, Michelle Lu, Kier Storey, Miles Macklin, David Hoeller, et al. 2021. “Isaac Gym: High Performance GPU-Based Physics Simulation for Robot Learning.” arXiv. https://doi.org/10.48550/ARXIV.2108.10470.
Mannshardt-Shamseldin, Elizabeth C., Richard L. Smith, Stephan R. Sain, Linda O. Mearns, and Daniel Cooley. 2010. “Downscaling Extremes: A Comparison of Extreme Value Distributions in Point-Source and Gridded Precipitation Data.” The Annals of Applied Statistics 4 (March). https://doi.org/10.1214/09-aoas287.
Margossian, Charles C., and Andrew Gelman. 2023. “For How Many Iterations Should We Run Markov Chain Monte Carlo?” arXiv. https://doi.org/10.48550/ARXIV.2311.02726.
Marshall, John. 2016. “Coarsening Bias: How Coarse Treatment Measurement Upwardly Biases Instrumental Variable Estimates.” Political Analysis 24. https://doi.org/10.1093/pan/mpw007.
Martin, Andrew D., and Kevin M. Quinn. 2002. “Dynamic Ideal Point Estimation via Markov Chain Monte Carlo for the U.S. Supreme Court, 1953–1999.” Political Analysis 10. https://doi.org/10.1093/pan/10.2.134.
McCarron, C. Elizabeth, Eleanor M. Pullenayegum, Deborah A. Marshall, Ron Goeree, and Jean-Eric Tarride. 2009. “Handling Uncertainty in Economic Evaluations of Patient Level Data: A Review of the Use of Bayesian Methods to Inform Health Technology Assessments.” International Journal of Technology Assessment in Health Care 25 (October). https://doi.org/10.1017/s0266462309990316.
McCartan, Cory, Christopher T. Kenny, Tyler Simko, George Garcia, Kevin Wang, Melissa Wu, Shiro Kuriwaki, and Kosuke Imai. 2022. “Simulated Redistricting Plans for the Analysis and Evaluation of Redistricting in the United States.” Scientific Data 9 (November). https://doi.org/10.1038/s41597-022-01808-2.
McElreath, Richard, Barney Luttbeg, Sean P. Fogarty, Tomas Brodin, and Andrew Sih. 2007. “Evolution of Animal Personalities.” Nature 450 (November). https://doi.org/10.1038/nature06326.
McMann, Kelly, Daniel Pemstein, Brigitte Seim, Jan Teorell, and Staffan Lindberg. 2021. “Assessing Data Quality: An Approach and an Application.” Political Analysis 30 (September). https://doi.org/10.1017/pan.2021.27.
McWilliams, J. Michael, Michael E. Chernew, Bruce E. Landon, and Aaron L. Schwartz. 2015. “Performance Differences in Year 1 of Pioneer Accountable Care Organizations.” New England Journal of Medicine 372 (May). https://doi.org/10.1056/nejmsa1414929.
Meidani, Mohammadreza, and Behboud Mashoufi. 2016. “Introducing New Algorithms for Realising an FIR Filter with Less Hardware in Order to Eliminate Power Line Interference from the ECG Signal.” IET Signal Processing 10 (September). https://doi.org/10.1049/iet-spr.2015.0552.
Meng, Xiao-Li. 2018. “Statistical Paradises and Paradoxes in Big Data (i): Law of Large Populations, Big Data Paradox, and the 2016 US Presidential Election.” The Annals of Applied Statistics 12 (June). https://doi.org/10.1214/18-aoas1161sf.
Middleton, Joel A., Marc A. Scott, Ronli Diakow, and Jennifer L. Hill. 2016. “Bias Amplification and Bias Unmasking.” Political Analysis 24. https://doi.org/10.1093/pan/mpw015.
Mishra, Abhishek Kumar, Aline Carneiro Viana, and Nadjib Achir. 2023. “Introducing Benchmarks for Evaluating User-Privacy Vulnerability in WiFi.” 2023 IEEE 97th Vehicular Technology Conference (VTC2023-Spring), June. https://doi.org/10.1109/vtc2023-spring57618.2023.10199706.
Mölenberg, Famke J. M., Joreintje D. Mackenbach, Maartje P. Poelman, Susana Santos, Alex Burdorf, and Frank J. van Lenthe. 2021. “Socioeconomic Inequalities in the Food Environment and Body Composition Among School-Aged Children: A Fixed-Effects Analysis.” International Journal of Obesity 45 (August). https://doi.org/10.1038/s41366-021-00934-y.
Mollick, Ethan R., and Lilach Mollick. 2023. “Using AI to Implement Effective Teaching Strategies in Classrooms: Five Strategies, Including Prompts.” SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4391243.
Montgomery, Jacob M., Brendan Nyhan, and Michelle Torres. 2018. “Replication Data for: How Conditioning on Posttreatment Variables Can Ruin Your Experiment and What to Do about It.” https://doi.org/10.7910/DVN/EZSJ1S.
Moran, Mary Ann, Alison Buchan, José M. González, John F. Heidelberg, William B. Whitman, Ronald P. Kiene, James R. Henriksen, et al. 2004. “Genome Sequence of Silicibacter Pomeroyi Reveals Adaptations to the Marine Environment.” Nature 432 (December). https://doi.org/10.1038/nature03170.
Moreau, Luc, and Paul Groth. 2013. “Provenance: An Introduction to PROV.” Synthesis Lectures on Data, Semantics, and Knowledge. https://doi.org/10.1007/978-3-031-79450-6.
Muñoz, Jordi, Albert Falcó-Gimeno, and Enrique Hernandez. 2019. “Replication Data for: Unexpected Event During Survey Design: Promise and Pitfalls for Causal Inference.” https://doi.org/10.7910/DVN/RDIIVL.
Nottingham, Kolby, Prithviraj Ammanabrolu, Alane Suhr, Yejin Choi, Hannaneh Hajishirzi, Sameer Singh, and Roy Fox. 2023. “Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making Using Language Guided World Modelling.” arXiv. https://doi.org/10.48550/ARXIV.2301.12050.
Oberski, Daniel L. 2014. “Evaluating Sensitivity of Parameters of Interest to Measurement Invariance in Latent Variable Models.” Political Analysis 22. https://doi.org/10.1093/pan/mpt014.
“Observational Studies: Getting Clear about Transparency.” 2014. PLoS Medicine 11 (August). https://doi.org/10.1371/journal.pmed.1001711.
Okasha, Samir. 2013. “The Evolution of Bayesian Updating.” Philosophy of Science 80 (December). https://doi.org/10.1086/674058.
Paolino, Philip. 2001. “Maximum Likelihood Estimation of Models with Beta-Distributed Dependent Variables.” Political Analysis 9. https://doi.org/10.1093/oxfordjournals.pan.a004873.
Park, David K., Andrew Gelman, and Joseph Bafumi. 2004. “Bayesian Multilevel Estimation with Poststratification: State-Level Estimates from National Polls.” Political Analysis 12. https://doi.org/10.1093/pan/mph024.
Pianka, John Paul. n.d. “The Power of the Force: Race, Gender, and Colonialism in the Star Wars Universe.” https://doi.org/10.14418/wes01.2.40.
Poole, Keith T. 2000. “Nonparametric Unfolding of Binary Choice Data.” Political Analysis 8 (March). https://doi.org/10.1093/oxfordjournals.pan.a029814.
Prelec, Dražen. 2004. “A Bayesian Truth Serum for Subjective Data.” Science 306 (October). https://doi.org/10.1126/science.1102081.
Pripp, Are Hugo. 2015. “Hvorfor p-Verdien Er Signifikant” [Why the p-Value Is Significant]. Tidsskrift for Den Norske Legeforening 135. https://doi.org/10.4045/tidsskr.15.0493.
Radwan, Taher M., G. Alan Blackburn, J. Duncan Whyatt, and Peter M. Atkinson. 2019. “Dramatic Loss of Agricultural Land Due to Urban Expansion Threatens Food Security in the Nile Delta, Egypt.” Remote Sensing 11 (February). https://doi.org/10.3390/rs11030332.
Rajkomar, Alvin, Eyal Oren, Kai Chen, Andrew M. Dai, Nissan Hajaj, Michaela Hardt, Peter J. Liu, et al. 2018. “Scalable and Accurate Deep Learning with Electronic Health Records.” Npj Digital Medicine 1 (May). https://doi.org/10.1038/s41746-018-0029-1.
Revicki, Dennis A, David Cella, Ron D Hays, Jeff A Sloan, William R Lenderking, and Neil K Aaronson. 2006. “Responsiveness and Minimal Important Differences for Patient Reported Outcomes.” Health and Quality of Life Outcomes 4 (September). https://doi.org/10.1186/1477-7525-4-70.
Rohwer, Robin R., and Katherine D. McMahon. 2022. “Lake iTag Measurements over Nineteen Years, Introducing the Limony Dataset,” August. https://doi.org/10.1101/2022.08.04.502869.
Rosen, Ori, Wenxin Jiang, Gary King, and Martin A. Tanner. 2001. “Bayesian and Frequentist Inference for Ecological Inference: The R×C Case.” Statistica Neerlandica 55 (July). https://doi.org/10.1111/1467-9574.00162.
Rosenfeld, Bryn, Kosuke Imai, and Jacob N. Shapiro. 2015. “An Empirical Validation Study of Popular Survey Methodologies for Sensitive Questions.” American Journal of Political Science 60 (August). https://doi.org/10.1111/ajps.12205.
Rosenman, Evan, Santiago Olivella, and Kosuke Imai. 2022. “Name Dictionaries for ‘wru’ R Package.” https://doi.org/10.7910/DVN/7TRYAC.
Ross, Cody T., Richard McElreath, and Daniel Redhead. 2022. “Modelling Human and Non-Human Animal Network Data in R Using STRAND,” May. https://doi.org/10.1101/2022.05.13.491798.
———. 2023. “Modelling Animal Network Data in R Using STRAND.” Journal of Animal Ecology, November. https://doi.org/10.1111/1365-2656.14021.
Roubik, David W. 2002. “The Value of Bees to the Coffee Harvest.” Nature 417 (June). https://doi.org/10.1038/417708a.
Round, Jeff, Robyn Drake, Edward Kendall, Rachael Addicott, Nicky Agelopoulos, and Louise Jones. 2013. “Evaluating a Complex System-Wide Intervention Using the Difference in Differences Method: The Delivering Choice Programme.” BMJ Supportive & Palliative Care 5 (August). https://doi.org/10.1136/bmjspcare-2012-000285.
Sain, Stephan R., Reinhard Furrer, and Noel Cressie. 2011. “A Spatial Analysis of Multivariate Output from Regional Climate Models.” The Annals of Applied Statistics 5 (March). https://doi.org/10.1214/10-aoas369.
Salehyan, Idean, Cullen S. Hendrix, Jesse Hamner, Christina Case, Christopher Linebarger, Emily Stull, and Jennifer Williams. 2012. “Social Conflict in Africa: A New Database.” International Interactions 38 (September). https://doi.org/10.1080/03050629.2012.697426.
Sang, Huiyan, Mikyoung Jun, and Jianhua Z. Huang. 2011. “Covariance Approximation for Large Multivariate Spatial Data Sets with an Application to Multiple Climate Model Errors.” The Annals of Applied Statistics 5 (December). https://doi.org/10.1214/11-aoas478.
Saul, Wolf‐Christian, Helen E. Roy, Olaf Booy, Lucilla Carnevali, Hsuan‐Ju Chen, Piero Genovesi, Colin A. Harrower, et al. 2016. “Assessing Patterns in Introduction Pathways of Alien Species by Linking Major Invasion Data Bases.” Journal of Applied Ecology 54 (November). https://doi.org/10.1111/1365-2664.12819.
Schauer, Michael, Ramon Massana, and Carlos Pedrós-Alió. 2000. “Spatial Differences in Bacterioplankton Composition Along the Catalan Coast (NW Mediterranean) Assessed by Molecular Fingerprinting.” FEMS Microbiology Ecology 33 (July). https://doi.org/10.1111/j.1574-6941.2000.tb00726.x.
Schmucker, Christine M., Anette Blümle, Lisa K. Schell, Guido Schwarzer, Patrick Oeller, Laura Cabrera, Erik von Elm, Matthias Briel, and Joerg J. Meerpohl. 2017. “Systematic Review Finds That Study Data Not Published in Full Text Articles Have Unclear Impact on Meta-Analyses Results in Medical Research.” PLOS ONE 12 (April). https://doi.org/10.1371/journal.pone.0176210.
Schmucker, Christine, Lisa K. Schell, Susan Portalupi, Patrick Oeller, Laura Cabrera, Dirk Bassler, Guido Schwarzer, et al. 2014. “Extent of Non-Publication in Cohorts of Studies Approved by Research Ethics Committees or Included in Trial Registries.” PLoS ONE 9 (December). https://doi.org/10.1371/journal.pone.0114023.
Schnell, Sylvia, and Gary M. King. 1994. “Mechanistic Analysis of Ammonium Inhibition of Atmospheric Methane Consumption in Forest Soils.” Applied and Environmental Microbiology 60 (October). https://doi.org/10.1128/aem.60.10.3514-3521.1994.
Schuessler, Alexander A. 1999. “Ecological Inference.” Proceedings of the National Academy of Sciences 96 (September). https://doi.org/10.1073/pnas.96.19.10578.
Seife, Charles. 2015. “Research Misconduct Identified by the US Food and Drug Administration.” JAMA Internal Medicine 175 (April). https://doi.org/10.1001/jamainternmed.2014.7774.
Sekhon, Jasjeet S. 2009. “Opiates for the Matches: Matching Methods for Causal Inference.” Annual Review of Political Science 12 (June). https://doi.org/10.1146/annurev.polisci.11.060606.135444.
Sekhon, Jasjeet S., and Rocío Titiunik. 2012. “When Natural Experiments Are Neither Natural nor Experiments.” American Political Science Review 106 (February). https://doi.org/10.1017/s0003055411000542.
Shen, Yifei, Jiawei Shao, Xinjie Zhang, Zehong Lin, Hao Pan, Dongsheng Li, Jun Zhang, and Khaled B. Letaief. 2023. “Large Language Models Empowered Autonomous Edge AI for Connected Intelligence.” arXiv. https://doi.org/10.48550/ARXIV.2307.02779.
Sih, A., J. Stamps, L. H. Yang, R. McElreath, and M. Ramenofsky. 2010. “Behavior as a Key Component of Integrative Biology in a Human-Altered World.” Integrative and Comparative Biology 50 (October). https://doi.org/10.1093/icb/icq148.
Silvestri, Stefano, Shareeful Islam, Spyridon Papastergiou, Christos Tzagkarakis, and Mario Ciampi. 2023. “A Machine Learning Approach for the NLP-Based Analysis of Cyber Threats and Vulnerabilities of the Healthcare Ecosystem.” Sensors 23 (January). https://doi.org/10.3390/s23020651.
Singh, Ishika, Valts Blukis, Arsalan Mousavian, Ankit Goyal, Danfei Xu, Jonathan Tremblay, Dieter Fox, Jesse Thomason, and Animesh Garg. 2022. “ProgPrompt: Generating Situated Robot Task Plans Using Large Language Models.” arXiv. https://doi.org/10.48550/ARXIV.2209.11302.
Skreta, Marta, Zihan Zhou, Jia Lin Yuan, Kourosh Darvish, Alán Aspuru-Guzik, and Animesh Garg. 2024. “RePLan: Robotic Replanning with Perception and Language Models.” arXiv. https://doi.org/10.48550/ARXIV.2401.04157.
Smaldino, Paul E., Jeffrey C. Schank, and Richard McElreath. 2013. “Increased Costs of Cooperation Help Cooperators in the Long Run.” The American Naturalist 181 (April). https://doi.org/10.1086/669615.
Sobania, Dominik, Martin Briesch, Carol Hanna, and Justyna Petke. 2023. “An Analysis of the Automatic Bug Fixing Performance of ChatGPT.” arXiv. https://doi.org/10.48550/ARXIV.2301.08653.
Song, F., S. Parekh, L. Hooper, Y. K. Loke, J. Ryder, A. J. Sutton, C. Hing, C. S. Kwok, C. Pang, and I. Harvey. 2010. “Dissemination and Publication of Research Findings: An Updated Review of Related Biases.” Health Technology Assessment 14 (February). https://doi.org/10.3310/hta14080.
Song, Fujian, Lee Hooper, and Yoon Loke. 2013. “Publication Bias: What Is It? How Do We Measure It? How Do We Avoid It?” Open Access Journal of Clinical Trials, July. https://doi.org/10.2147/oajct.s34419.
Song, Fujian, Yoon Loke, and Lee Hooper. 2014. “Why Are Medical and Health-Related Studies Not Being Published? A Systematic Review of Reasons Given by Investigators.” PLoS ONE 9 (October). https://doi.org/10.1371/journal.pone.0110418.
Steegen, Sara, Francis Tuerlinckx, Andrew Gelman, and Wolf Vanpaemel. 2016. “Increasing Transparency Through a Multiverse Analysis.” Perspectives on Psychological Science 11 (September). https://doi.org/10.1177/1745691616658637.
Stommes, Drew, P. M. Aronow, and Fredrik Sävje. 2021. “On the Reliability of Published Findings Using the Regression Discontinuity Design in Political Science.” arXiv. https://doi.org/10.48550/ARXIV.2109.14526.
———. 2023. “On the Reliability of Published Findings Using the Regression Discontinuity Design in Political Science.” Research & Politics 10 (April). https://doi.org/10.1177/20531680231166457.
Tay, C. C. K., A. F. Glasier, and A. S. McNeilly. 1996. “Twenty-Four Hour Patterns of Prolactin Secretion During Lactation and the Relationship to Suckling and the Resumption of Fertility in Breast-Feeding Women.” Human Reproduction 11 (May). https://doi.org/10.1093/oxfordjournals.humrep.a019330.
Taylor, Sean J, and Benjamin Letham. 2017. “Forecasting at Scale,” September. https://doi.org/10.7287/peerj.preprints.3190v2.
“This COVID-19 Sleuth Is Making Friends and Foes Advocating for African Science.” 2022. Science, October. https://doi.org/10.1126/science.adf2055.
Tian, Haoye, Weiqi Lu, Tsz On Li, Xunzhu Tang, Shing-Chi Cheung, Jacques Klein, and Tegawendé F. Bissyandé. 2023. “Is ChatGPT the Ultimate Programming Assistant – How Far Is It?” arXiv. https://doi.org/10.48550/ARXIV.2304.11938.
Todorov, Emanuel, Tom Erez, and Yuval Tassa. 2012. “MuJoCo: A Physics Engine for Model-Based Control.” 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, October. https://doi.org/10.1109/iros.2012.6386109.
Tricco, Andrea C., Jennifer Tetzlaff, Margaret Sampson, Dean Fergusson, Elise Cogo, Tanya Horsley, and David Moher. 2008. “Few Systematic Reviews Exist Documenting the Extent of Bias: A Systematic Review.” Journal of Clinical Epidemiology 61 (May). https://doi.org/10.1016/j.jclinepi.2007.10.017.
Trisovic, Ana, Matthew K. Lau, Thomas Pasquier, and Mercè Crosas. 2022. “A Large-Scale Study on Research Code Quality and Execution.” Scientific Data 9 (February). https://doi.org/10.1038/s41597-022-01143-6.
Truong, Charles, Laurent Oudre, and Nicolas Vayatis. 2020. “Selective Review of Offline Change Point Detection Methods.” Signal Processing 167 (February). https://doi.org/10.1016/j.sigpro.2019.107299.
Tucker, Compton J., Jorge E. Pinzon, Molly E. Brown, Daniel A. Slayback, Edwin W. Pak, Robert Mahoney, Eric F. Vermote, and Nazmi El Saleous. 2005. “An Extended AVHRR 8‐km NDVI Dataset Compatible with MODIS and SPOT Vegetation NDVI Data.” International Journal of Remote Sensing 26 (October). https://doi.org/10.1080/01431160500168686.
Ueno, Koji, Toyotaro Suzumura, Naoya Maruyama, Katsuki Fujisawa, and Satoshi Matsuoka. 2016. “Extreme Scale Breadth-First Search on Supercomputers.” 2016 IEEE International Conference on Big Data (Big Data), December. https://doi.org/10.1109/bigdata.2016.7840705.
Uppoor, Sandesh, Oscar Trullols-Cruces, Marco Fiore, and Jose M. Barcelo-Ordinas. 2014. “Generation and Analysis of a Large-Scale Urban Vehicular Mobility Dataset.” IEEE Transactions on Mobile Computing 13 (May). https://doi.org/10.1109/tmc.2013.27.
“UZH in BioNLP 2013.” 2013. Association for Computational Linguistics. https://doi.org/10.5167/UZH-91884.
Valmeekam, Karthik, Matthew Marquez, Alberto Olmo, Sarath Sreedharan, and Subbarao Kambhampati. 2022. “PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change.” arXiv. https://doi.org/10.48550/ARXIV.2206.10498.
Valmeekam, Karthik, Sarath Sreedharan, Matthew Marquez, Alberto Olmo, and Subbarao Kambhampati. 2023. “On the Planning Abilities of Large Language Models (a Critical Investigation with a Proposed Benchmark).” arXiv. https://doi.org/10.48550/ARXIV.2302.06706.
VanderWeele, Tyler J., and James M. Robins. 2007. “Four Types of Effect Modification.” Epidemiology 18 (September). https://doi.org/10.1097/ede.0b013e318127181b.
Vermeulen, Marian J., Astrid Guttmann, Therese A. Stukel, Ashif Kachra, Marco L. A. Sivilotti, Brian H. Rowe, Jonathan Dreyer, Robert Bell, and Michael Schull. 2015. “Are Reductions in Emergency Department Length of Stay Associated with Improvements in Quality of Care? A Difference-in-Differences Analysis.” BMJ Quality & Safety 25 (August). https://doi.org/10.1136/bmjqs-2015-004189.
Wake, Naoki, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, and Katsushi Ikeuchi. 2023. “ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application.” IEEE Access 11. https://doi.org/10.1109/access.2023.3310935.
Wang, Guojie, Damien Garcia, Yi Liu, Richard de Jeu, and A. Johannes Dolman. 2012. “A Three-Dimensional Gap Filling Method for Large Geophysical Datasets: Application to Global Satellite Soil Moisture Observations.” Environmental Modelling & Software 30 (April). https://doi.org/10.1016/j.envsoft.2011.10.015.
Wang, Jun, Lixing Zhu, Abhir Bhalerao, and Yulan He. 2023. “Can Prompt Learning Benefit Radiology Report Generation?” arXiv. https://doi.org/10.48550/ARXIV.2308.16269.
Wang, Wei, David Rothschild, Sharad Goel, and Andrew Gelman. 2015. “Forecasting Elections with Non-Representative Polls.” International Journal of Forecasting 31 (July). https://doi.org/10.1016/j.ijforecast.2014.06.001.
Wang, Xiaoyue, Hui Ding, Goce Trajcevski, Peter Scheuermann, and Eamonn Keogh. 2010. “Experimental Comparison of Representation Methods and Distance Measures for Time Series Data.” arXiv. https://doi.org/10.48550/ARXIV.1012.2789.
Wang, Yue, Hung Le, Akhilesh Deepak Gotmare, Nghi D. Q. Bui, Junnan Li, and Steven C. H. Hoi. 2023. “CodeT5+: Open Code Large Language Models for Code Understanding and Generation.” arXiv. https://doi.org/10.48550/ARXIV.2305.07922.
Warner, Stanley L. 1965. “Randomized Response: A Survey Technique for Eliminating Evasive Answer Bias.” Journal of the American Statistical Association 60 (March). https://doi.org/10.1080/01621459.1965.10480775.
Westergaard, David, Hans-Henrik Stærfeldt, Christian Tønsberg, Lars Juhl Jensen, and Søren Brunak. 2017. “Text Mining of 15 Million Full-Text Scientific Articles,” July. https://doi.org/10.1101/162099.
Williams, Nigel. 1997. “Editors Seek Ways to Cope with Fraud.” Science 278 (November). https://doi.org/10.1126/science.278.5341.1221.
Wilson, David Sloan. 1998. “Adaptive Individual Differences Within Single Populations.” Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences 353 (February). https://doi.org/10.1098/rstb.1998.0202.
Wirtz, Veronika J., Yared Santa-Ana-Tellez, Edson Servan-Mori, and Leticia Avila-Burgos. 2012. “Heterogeneous Effects of Health Insurance on Out-of-Pocket Expenditure on Medicines in Mexico.” Value in Health 15 (July). https://doi.org/10.1016/j.jval.2012.01.006.
Wu, Zhenyu, Ziwei Wang, Xiuwei Xu, Jiwen Lu, and Haibin Yan. 2023. “Embodied Task Planning with Large Language Models.” arXiv. https://doi.org/10.48550/ARXIV.2307.01848.
Xia, Chunqiu Steven, and Lingming Zhang. 2023. “Conversational Automated Program Repair.” arXiv. https://doi.org/10.48550/ARXIV.2301.13246.
Xie, Yaqi, Chen Yu, Tongyao Zhu, Jinbin Bai, Ze Gong, and Harold Soh. 2023. “Translating Natural Language to Planning Goals with Large-Language Models.” arXiv. https://doi.org/10.48550/ARXIV.2302.05128.
Yan, Xifeng, and Jiawei Han. n.d. “gSpan: Graph-Based Substructure Pattern Mining.” 2002 IEEE International Conference on Data Mining, 2002. Proceedings. https://doi.org/10.1109/icdm.2002.1184038.
Yang, Sherry, Ofir Nachum, Yilun Du, Jason Wei, Pieter Abbeel, and Dale Schuurmans. 2023. “Foundation Models for Decision Making: Problems, Methods, and Opportunities.” arXiv. https://doi.org/10.48550/ARXIV.2303.04129.
Yao, Shunyu, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, and Karthik Narasimhan. 2023. “Tree of Thoughts: Deliberate Problem Solving with Large Language Models.” arXiv. https://doi.org/10.48550/ARXIV.2305.10601.
Yu, Tianhe, Ted Xiao, Austin Stone, Jonathan Tompson, Anthony Brohan, Su Wang, Jaspiar Singh, et al. 2023. “Scaling Robot Learning with Semantically Imagined Experience.” arXiv. https://doi.org/10.48550/ARXIV.2302.11550.
Yuan, Haoqi, Chi Zhang, Hongcheng Wang, Feiyang Xie, Penglin Cai, Hao Dong, and Zongqing Lu. 2023. “Skill Reinforcement Learning and Planning for Open-World Long-Horizon Tasks.” arXiv. https://doi.org/10.48550/ARXIV.2303.16563.
Zhang, Shun, Zhenfang Chen, Yikang Shen, Mingyu Ding, Joshua B. Tenenbaum, and Chuang Gan. 2023. “Planning with Large Language Models for Code Generation.” arXiv. https://doi.org/10.48550/ARXIV.2303.05510.
Zheng, Ou, Mohamed Abdel-Aty, Dongdong Wang, Zijin Wang, and Shengxuan Ding. 2023. “ChatGPT Is on the Horizon: Could a Large Language Model Be Suitable for Intelligent Traffic Safety Research and Applications?” arXiv. https://doi.org/10.48550/ARXIV.2303.05382.
Zheng, Qinkai, Xiao Xia, Xu Zou, Yuxiao Dong, Shan Wang, Yufei Xue, Zihan Wang, et al. 2023. “CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-x.” arXiv. https://doi.org/10.48550/ARXIV.2303.17568.